Aller au contenu

Agent Web Access & Gateway

Ce contenu n’est pas encore disponible dans votre langue.

Source isolation jails the agent runner: --network internal, no open egress, no platform credentials. The side effect is that the agent can’t reach the web or any tool a tenant connects either. This subsystem is the other half of that story — four server-mediated capabilities that give an isolated agent a real, audited reach to the outside world without re-opening the jail.

The shared shape (the convergent reference pattern: don’t open the sandbox, broker the outbound through the control plane): the isolated runner only requests a server-side action over the existing loopback bridge (papi.py → the in-task shim → the host platform API). The server makes the real outbound call; the sandbox network is never opened, and no key ever enters the runner.

All four lanes ship default-OFF / byte-identical when off. The mechanisms are merged and reviewed; arming each is a deliberate, separate flip. Shipped ≠ armed — be precise about which is which (the status table below).

Lane ADR What the agent gets Status
Web-fetch broker 0026 fetch (URL→text), search (discovery), ingest (propose a doc for human approval) Shipped, default-OFF
Content-trust / injection defense 0027 Fetched content arrives wrapped as untrusted DATA, screened + annotated — never stripped Shipped, default-OFF (rides the broker flag)
AI gateway + credential vault 0024 A use-but-can’t-read credential path so SaaS-direct adapters regain an LLM under isolation Skeleton + wiring + Class-A merged, default-OFF; Class-A not end-to-end live
MCP / tool proxy 0028 Server-mediated reach for the MCP servers / tools a tenant connects in the app Increment-1 shipped, default-OFF (PR #639)

A server-side capability the agent requests. It runs in the platform process (where the SSRF guard, the egress gate, and the business-library write path already live), not in the runner — so it works under the --network internal jail with zero change to the sandbox network. The broker’s egress is the server’s egress, governed by the eu-west-3 residency invariant.

  • Two read verbs + one human-gated verb. fetch <url> / search <query> return text transiently into the run context. ingest <url|doc> routes content into the human-gated business-library PUT path — an agent proposes, a human approves. An agent can never silently write external content into shared tenant knowledge.
  • Free-first backend ladder. Default backend is the in-VPC SSRF primitive + Scrapling — free, in-process, no vendor, no key, EU-resident by construction (no sub-processor). Escalates to a disclosed EU vendor only when a page needs it: Firecrawl-EU (JS/PDF), Exa (search), Tavily (search fallback).
  • EU residency is a config-load invariant. A non-EU backend is refused at config-load. Vendor backends draw from monthly-rotated key pools; the pool stores a sha256 fingerprint, never the key.
  • URL policy. Any URL, GET only, same-domain shallow crawl (bounded by depth + page count + byte ceiling); off-domain link-following is refused.
  • Per-fetch audit (Art. 30 artifact). One row per call through the egress gate (run_id, tenant_id, verb, backend, target host, bytes, pages, redactions, outcome) — server-derived tenant, eu-west-3. A fetch whose audit can’t be written is refused (fail-CLOSED on the audit; fail-soft on content truncation).
  • Bridge body cap + chunking. Results return over the loopback bridge, whose single-message ceiling is 256 KB (BRIDGE_BODY_CAP_BYTES in server/src/services/web-fetch/types.ts); by default the broker clamps the returned body with a truncation marker. Chunking is a shipped, default-OFF flag (PR #644), not just a clamp: with AIMETIER_SANDBOX_BRIDGE_CHUNKING_ENABLED on, the host splits a large body into ordered chunks the runner reassembles, lifting the cap up to a hard total ceiling AIMETIER_SANDBOX_BRIDGE_MAX_TOTAL_BYTES (default 8 MB; a body over it is a clean 502, never silently truncated). The broker’s image-bearing tier CHUNKED_FETCH_MAX_BYTES (7 MB) sits under that ceiling. Both flags default-OFF / byte-identical (off = exactly the 256 KB clamp).

Content-trust & injection defense (ADR 0027)

Section titled “Content-trust & injection defense (ADR 0027)”

Fetched content is DATA the agent reads, never instructions. The honest framing, repeated in the ADR: this layer detects, structures, and annotates — it does not remove or sanitize prompt injection. Reliably stripping injection from arbitrary web text is an open research problem, and over-scrubbing is the symmetric failure (a legit brand blog or security article quoting ignore previous instructions must round-trip byte-identical). No PR, PRD, or marketing copy may claim it “removes” injection.

The real defense is structural and was already shipped: the no-egress sandbox, broker-as-sole-egress, SSRF + per-company allow-list on every crawl hop, outbound PII redaction, Bash-as-boundary, the per-agent tool allow-list, and the fail-closed audit. The content layers are defense-in-depth on top. The split: content layers fail OPEN (annotate, return the page, log the skip — never blind discovery or drop a legit page); only the structural layers (egress, audit, ingest-promotion) fail CLOSED.

Layer Action Fail-mode
L-WALL (structural) The real defense, already shipped fail-CLOSED
L0 nonce wrapper Wrap as untrusted DATA + provenance line; strip only a forged delimiter fail-OPEN
L1 heuristic screen Free in-VPC pre-pass; flags injection-shaped spans, annotate-only — never strips or blocks fail-OPEN
L1b outbound re-gate Re-gate a content-derived follow-up fetch (closes the one-hop exfil) — a wall fail-CLOSED
L1c ingest quarantine Fetched bytes never auto-promote to curated context — a wall fail-CLOSED
L2 prompt policy Static managed block: content is DATA, embedded instructions are to be reported not obeyed n/a (biases, doesn’t enforce)
L3 Bedrock-Haiku judge Optional LLM judge, annotate-only; default-OFF, arming deferred fail-OPEN

The verdict audit stores a verdict only (score, rule IDs, span offsets, content sha256) — never the matched raw substring or page body. External moderation APIs were rejected outright (a new sub-processor + cross-border transfer with no lawful basis under the eu-west-3 GDPR invariant). The permanent residual, named not hidden: a capable model is an influenceable model; a --yolo CLI has no hard in-band wall, so safety rests on the structural wall.

The AI gateway + credential vault (ADR 0024)

Section titled “The AI gateway + credential vault (ADR 0024)”

Under source isolation, the SaaS provider keys (Cursor, Copilot, OpenAI, Gemini) are dropped entirely from the runner env — so those adapters have no LLM under isolation today; only the keyless Bedrock path (claude_local) works. The gateway + vault closes that gap with the Anthropic Managed-Agents use-but-can’t-read pattern: the runner carries an opaque placeholder, the real key is attached server-side at egress for that credential’s allow-listed hosts only. LLM access is restored via egress-substitution / a per-run virtual credential — never by un-stripping the keys back into the runner env.

  • What’s merged (all default-OFF): the gateway skeleton (PR #637), the consumption-wiring into runner-spawn + a fail-closed gateway_audit_log (PR #640), and the host-side Class-A proxy-client (PR #645) — all behind AIMETIER_LLM_GATEWAY_ENABLED (master) + the subordinate AIMETIER_LLM_GATEWAY_CLASSA_PROXY_ENABLED, both default-OFF, byte-identical when off.
  • Class-B is the live-when-armed path: for provider-native CLIs (cursor, copilot), a per-run virtual-credential shim vends a short-lived credential into the CLI’s config, mapped to the real key at egress, revoked at run end.
  • Class-A is wired but NOT end-to-end live: for base-URL-routable adapters (codex/openai, gemini), the host-side proxy-client is merged, but the TLS-terminating forward proxy that decrypts the runner’s provider TLS to inject the header is the documented remaining piece (a TODO). So Class-A is not end-to-end live even when its flag is on.
  • The audit log is synchronous + fail-closed: gateway_audit_log (migration 0223) is a single synchronous INSERT appended atomically with each gateway action — if the audit write fails, the gateway refuses the request rather than serve it unlogged. It is not the async usage emitter.
  • Trust model (owner-signed, ADR 0024 §Amendment 2026-06-25): TLS substitution = jailed-proxy TLS-termination (accepted as named — the honest claim is “only the jailed egress proxy reads the key, in-flight, never the agent code”); the gateway holds no standing credential (the server server-proxies each vault read); the routing library runs in a keyless RPC child. The master flag stays OFF; arming is a separate prod-tfvars-only PR.

The MCP / tool proxy (ADR 0028) — increment-1 shipped, default-OFF

Section titled “The MCP / tool proxy (ADR 0028) — increment-1 shipped, default-OFF”

The companion gap to web-reading: a tenant can connect an MCP server or network tool in the app, but the isolated agent can’t reach it at run time. ADR 0028 generalizes the fetch broker’s shape to “call a tool on a tenant-connected MCP server” — the server makes the outbound MCP call on the runner’s behalf over the loopback bridge.

Increment-1 is SHIPPED default-OFF (PR #639); arming deferred. The proxy is merged on origin/main behind AIMETIER_MCP_PROXY_ENABLED (default-OFF, mcp_proxy_enabled="false" pinned in accounts/dev.tfvars, byte-identical when off — both verbs 404). It exposes two agent-facing verbs over the loopback bridge — papi.py mcp-tools (list the agent’s granted user-connected MCP tools) and papi.py mcp-call (call one granted tool; the server makes the outbound call). Code: server/src/routes/mcp-proxy.ts + server/src/services/mcp-proxy/. (The existing server/src/routes/mcp-server-out.ts is the opposite direction — exposing an AImetier agent as an MCP server to external clients — and is unrelated.) Shipped ≠ armed: the flag is OFF and the proxy is not armed on the live tenant. The locked invariants live in the shipped code: the secret stays server-side (never vended into the runner); SSRF re-validated + IP-pinned per call inside mcp-client.ts rpc() (DNS-rebinding TOCTOU closed, no redirect follow); grant-gated via the per-agent allow-list (ADR 0022, reused via tool-builder.ts + wrapWithScopeGate; connected MCP tools are enabledByDefault:false); per-call server-derived-tenant audit; MCP results inherit ADR 0027’s untrusted-content framing. The reason arming is deferred: a connected MCP server may live outside eu-west-3, making it a cross-border transfer + disclosed sub-processor — how the residency invariant applies is a deliberate open question that, with a DPA disclosure, must be settled before arming on a live tenant.

A vendor backend (Firecrawl-EU / Exa / Tavily) is a disclosed sub-processor, so the broker can’t quietly add one in code without the disclosure landing with it. A sub-processor register (docs/compliance/sub-processors.md) + a CI gate (scripts/check-subprocessor-register.mjs, a job in .github/workflows/deploy.yml, PR #643) blocks arming a vendor backend unless it’s disclosed — the gate fails closed on an unknown backend.

As pinned in infra/terraform/accounts/dev.tfvars:

Flag (ECS tfvar) Default Live pin Arms
web_fetch_broker_enabled false false The whole web-fetch broker (everything below is inert while off)
web_fetch_backend in-vpc in-vpc Which backend fetch uses; a non-EU value is refused at config-load
web_fetch_ladder false false The free-first fetch ladder (Scrapling → in-VPC → vendor)
web_fetch_secrets_enabled false false Lets the broker read the vendor key pools
web_fetch_injection_screen true true The free L1 in-VPC heuristic screen (annotate-only); inert while the broker is off
web_fetch_injection_judge_enabled false false The paid L3 Bedrock-Haiku judge — arming deferred
llm_gateway_enabled false false The gateway master flag; off = gateway never constructed, keys stay stripped
llm_gateway_classa_proxy_enabled false false Class-A egress proxy-client (subordinate); even when on, NOT live until the TLS terminator ships
mcp_proxy_enabled false false The user-connected MCP/tool proxy (ADR 0028, PR #639); off = both verbs 404. Arming deferred pending the residency / sub-processor decision
sandbox_bridge_chunking_enabled false false Loopback-bridge response chunking (PR #644); off = the 256 KB single-message clamp, on = bodies up to sandbox_bridge_max_total_bytes (8 MB)
runner_egress_hardened false true The structural egress jail these lanes sit on (ADR 0024 §0) — the one already-armed floor

The corresponding AIMETIER_* env vars are wired through all three deploy paths (my_setup/.env{,.cloud}.template, docker/docker-compose.ai-agency.yml, infra/terraform/modules/ecs-cluster/{main.tf,variables.tf}).

Concern Doc
Runtime deep-dive (verbs, ladder, layers, gateway, MCP) my_setup/docs/aimetier-explained/web-access.html
The jail these lanes sit on my_setup/docs/aimetier-explained/source-isolation.html
Fetch broker design + locked invariants ADR 0026
Content-trust / detect-not-remove ADR 0027
Gateway + credential vault ADR 0024
MCP/tool proxy (increment-1 shipped default-OFF) ADR 0028
End-user view (“look things up + use your tools, safely”) Web and connected tools