Agent Web Access & Gateway
What this is
Source isolation jails the agent runner: --network internal, no open egress, no platform credentials. The side effect is that the agent can’t reach the web or any tool a tenant connects either. This subsystem is the other half of that story: four server-mediated capabilities that give an isolated agent a real, audited reach to the outside world without re-opening the jail.
All four lanes share one shape, the convergent reference pattern: don’t open the sandbox, broker the outbound call through the control plane. The isolated runner only requests a server-side action over the existing loopback bridge (papi.py → the in-task shim → the host platform API). The server makes the real outbound call; the sandbox network is never opened, and no key ever enters the runner.
All four lanes ship default-OFF / byte-identical when off. The mechanisms are merged and reviewed; arming each is a deliberate, separate flip. Shipped ≠ armed — be precise about which is which (the status table below).
The four lanes
| Lane | ADR | What the agent gets | Status |
|---|---|---|---|
| Web-fetch broker | 0026 | fetch (URL→text), search (discovery), ingest (propose a doc for human approval) | Shipped, default-OFF |
| Content-trust / injection defense | 0027 | Fetched content arrives wrapped as untrusted DATA, screened + annotated — never stripped | Shipped, default-OFF (rides the broker flag) |
| AI gateway + credential vault | 0024 | A use-but-can’t-read credential path so SaaS-direct adapters regain an LLM under isolation | Skeleton + wiring + Class-A merged, default-OFF; Class-A not end-to-end live |
| MCP / tool proxy | 0028 | Server-mediated reach for the MCP servers / tools a tenant connects in the app | Increment-1 shipped, default-OFF (PR #639) |
The web-fetch broker (ADR 0026)
A server-side capability the agent requests. It runs in the platform process (where the SSRF guard, the egress gate, and the business-library write path already live), not in the runner, so it works under the --network internal jail with zero change to the sandbox network. The broker’s egress is the server’s egress, governed by the eu-west-3 residency invariant.
- Two read verbs + one human-gated verb. `fetch <url>` / `search <query>` return text transiently into the run context. `ingest <url|doc>` routes content into the human-gated business-library PUT path: an agent proposes, a human approves. An agent can never silently write external content into shared tenant knowledge.
- Free-first backend ladder. Default backend is the in-VPC SSRF primitive + Scrapling: free, in-process, no vendor, no key, EU-resident by construction (no sub-processor). Escalates to a disclosed EU vendor only when a page needs it: Firecrawl-EU (JS/PDF), Exa (search), Tavily (search fallback).
- EU residency is a config-load invariant. A non-EU backend is refused at config-load. Vendor backends draw from monthly-rotated key pools; the pool stores a `sha256` fingerprint, never the key.
- URL policy. Any URL, GET only, same-domain shallow crawl (bounded by depth + page count + byte ceiling); off-domain link-following is refused.
- Per-fetch audit (Art. 30 artifact). One row per call through the egress gate (`run_id`, `tenant_id`, verb, backend, target host, bytes, pages, redactions, outcome); server-derived tenant, eu-west-3. A fetch whose audit can’t be written is refused (fail-CLOSED on the audit; fail-soft on content truncation).
- Bridge body cap + chunking. Results return over the loopback bridge, whose single-message ceiling is 256 KB (`BRIDGE_BODY_CAP_BYTES` in `server/src/services/web-fetch/types.ts`); by default the broker clamps the returned body with a truncation marker. Chunking is a shipped, default-OFF flag (PR #644), not just a clamp: with `AIMETIER_SANDBOX_BRIDGE_CHUNKING_ENABLED` on, the host splits a large body into ordered chunks the runner reassembles, lifting the cap up to a hard total ceiling `AIMETIER_SANDBOX_BRIDGE_MAX_TOTAL_BYTES` (default 8 MB; a body over it is a clean `502`, never silently truncated). The broker’s image-bearing tier `CHUNKED_FETCH_MAX_BYTES` (7 MB) sits under that ceiling. Both flags default-OFF / byte-identical (off = exactly the 256 KB clamp).
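The clamp-versus-chunk behaviour in the last bullet can be sketched as follows. This is a minimal illustration, not the shipped code: the function names, the chunk envelope fields (`seq`, `total`, `data`), the truncation marker text, and the `BodyTooLarge` exception are all hypothetical, and the sketch ignores any framing overhead the real bridge adds per message. Only the 256 KB cap, the 8 MB default ceiling, and the "over the ceiling is a clean 502" rule come from the text above.

```python
BRIDGE_BODY_CAP_BYTES = 256 * 1024      # single-message ceiling named in the text
MAX_TOTAL_BYTES = 8 * 1024 * 1024       # default total ceiling named in the text
TRUNCATION_MARKER = b"\n[truncated]"    # hypothetical marker text


class BodyTooLarge(Exception):
    """Stands in for the clean 502 returned when a body exceeds the ceiling."""


def clamp(body: bytes) -> bytes:
    """Flag off: cut the body to the single-message cap and mark the cut."""
    if len(body) <= BRIDGE_BODY_CAP_BYTES:
        return body
    keep = BRIDGE_BODY_CAP_BYTES - len(TRUNCATION_MARKER)
    return body[:keep] + TRUNCATION_MARKER


def split_chunks(body: bytes) -> list:
    """Flag on: split into ordered chunks, or refuse outright over the ceiling."""
    if len(body) > MAX_TOTAL_BYTES:
        raise BodyTooLarge(f"{len(body)} bytes exceeds {MAX_TOTAL_BYTES}")
    total = max(1, -(-len(body) // BRIDGE_BODY_CAP_BYTES))  # ceiling division
    return [
        {"seq": i, "total": total,
         "data": body[i * BRIDGE_BODY_CAP_BYTES:(i + 1) * BRIDGE_BODY_CAP_BYTES]}
        for i in range(total)
    ]


def reassemble(chunks: list) -> bytes:
    """Runner side: order by sequence number and reject gaps or duplicates."""
    ordered = sorted(chunks, key=lambda c: c["seq"])
    total = ordered[0]["total"] if ordered else 0
    if [c["seq"] for c in ordered] != list(range(total)):
        raise ValueError("missing or duplicate chunk")
    return b"".join(c["data"] for c in ordered)


def prepare_response(body: bytes, chunking_enabled: bool) -> list:
    """Off = exactly one clamped message; on = one or more ordered chunks."""
    if not chunking_enabled:
        return [{"seq": 0, "total": 1, "data": clamp(body)}]
    return split_chunks(body)
```

With the flag off, a 600 KB body comes back as a single 256 KB message ending in the marker; with it on, the same body arrives as three chunks that reassemble byte-for-byte.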
Content-trust & injection defense (ADR 0027)
Fetched content is DATA the agent reads, never instructions. The honest framing, repeated in the ADR: this layer detects, structures, and annotates; it does not remove or sanitize prompt injection. Reliably stripping injection from arbitrary web text is an open research problem, and over-scrubbing is the symmetric failure (a legit brand blog or security article quoting ignore previous instructions must round-trip byte-identical). No PR, PRD, or marketing copy may claim it “removes” injection.
The real defense is structural and was already shipped: the no-egress sandbox, broker-as-sole-egress, SSRF + per-company allow-list on every crawl hop, outbound PII redaction, Bash-as-boundary, the per-agent tool allow-list, and the fail-closed audit. The content layers are defense-in-depth on top. The split: content layers fail OPEN (annotate, return the page, log the skip — never blind discovery or drop a legit page); only the structural layers (egress, audit, ingest-promotion) fail CLOSED.
| Layer | Action | Fail-mode |
|---|---|---|
| L-WALL (structural) | The real defense, already shipped | fail-CLOSED |
| L0 nonce wrapper | Wrap as untrusted DATA + provenance line; strip only a forged delimiter | fail-OPEN |
| L1 heuristic screen | Free in-VPC pre-pass; flags injection-shaped spans, annotate-only — never strips or blocks | fail-OPEN |
| L1b outbound re-gate | Re-gate a content-derived follow-up fetch (closes the one-hop exfil); a wall | fail-CLOSED |
| L1c ingest quarantine | Fetched bytes never auto-promote to curated context — a wall | fail-CLOSED |
| L2 prompt policy | Static managed block: content is DATA, embedded instructions are to be reported not obeyed | n/a (biases, doesn’t enforce) |
| L3 Bedrock-Haiku judge | Optional LLM judge, annotate-only; default-OFF, arming deferred | fail-OPEN |
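The L0 row can be illustrated with a small sketch. The delimiter syntax, the `source:` provenance line, and the function name are assumptions made for the example; the table only states that content is wrapped as untrusted DATA with a provenance line and that nothing but a forged delimiter is stripped.

```python
import secrets
from typing import Optional


def wrap_untrusted(content: str, source_url: str, nonce: Optional[str] = None) -> str:
    """Wrap fetched text in per-call nonce delimiters with a provenance line.

    Hypothetical format. The only thing removed from the content is a forged
    copy of this call's own delimiters; everything else round-trips unchanged,
    including text that merely looks like an injection attempt.
    """
    nonce = nonce or secrets.token_hex(8)
    open_tag = f"<<UNTRUSTED-DATA {nonce}>>"
    close_tag = f"<<END-UNTRUSTED-DATA {nonce}>>"
    cleaned = content.replace(open_tag, "").replace(close_tag, "")
    return f"{open_tag}\nsource: {source_url}\n{cleaned}\n{close_tag}"
```

Because the nonce is fresh per call, page text cannot predict the delimiter in practice, so the strip step is a backstop and ordinary content, including a quoted "ignore previous instructions", passes through untouched.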
The verdict audit stores a verdict only (score, rule IDs, span offsets, content sha256) — never the matched raw substring or page body. External moderation APIs were rejected outright (a new sub-processor + cross-border transfer with no lawful basis under the eu-west-3 GDPR invariant). The permanent residual, named not hidden: a capable model is an influenceable model; a --yolo CLI has no hard in-band wall, so safety rests on the structural wall.
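A verdict-only record of the kind described above might look like the sketch below. The rule IDs, the regular expressions, and the scoring formula are invented for illustration; what the sketch takes from the text is the shape of the record (score, rule IDs, span offsets, content sha256) and the fact that neither the matched substring nor the page body is stored, and that the content itself is returned unmodified.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical rule set: IDs and patterns are illustrative only.
RULES = {
    "INJ-001": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    "INJ-002": re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
}


@dataclass(frozen=True)
class Verdict:
    score: float            # hypothetical scoring: 0.5 per distinct rule, capped at 1.0
    rule_ids: tuple         # which rules fired
    spans: tuple            # (start, end) offsets into the content
    content_sha256: str     # digest of the page, never the page itself


def screen(content: str) -> Verdict:
    """Annotate-only screen: the content is never altered or blocked here."""
    hits = []
    for rule_id, pattern in RULES.items():
        for match in pattern.finditer(content):
            hits.append((rule_id, match.start(), match.end()))
    rule_ids = tuple(sorted({rule_id for rule_id, _, _ in hits}))
    spans = tuple((s, e) for _, s, e in sorted(hits, key=lambda h: h[1]))
    score = min(1.0, 0.5 * len(rule_ids))
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return Verdict(score, rule_ids, spans, digest)
```

An auditor holding the original page can recover the flagged text from the offsets and verify the page by its digest, while the audit row on its own reveals neither.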
The AI gateway + credential vault (ADR 0024)
Under source isolation, the SaaS provider keys (Cursor, Copilot, OpenAI, Gemini) are dropped entirely from the runner env, so those adapters have no LLM under isolation today; only the keyless Bedrock path (claude_local) works. The gateway + vault closes that gap with the Anthropic Managed-Agents use-but-can’t-read pattern: the runner carries an opaque placeholder, the real key is attached server-side at egress for that credential’s allow-listed hosts only. LLM access is restored via egress-substitution / a per-run virtual credential, never by un-stripping the keys back into the runner env.
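The use-but-can’t-read idea can be shown with a small egress-substitution sketch. Everything named here is hypothetical (the in-memory `VAULT`, the placeholder string, the `attach_credential` function); it only illustrates the rule stated above: the runner sends a placeholder, and the real key is swapped in server-side, only for hosts on that credential’s allow-list.

```python
from urllib.parse import urlsplit

# Hypothetical server-side vault: opaque placeholder -> real key + allowed hosts.
VAULT = {
    "placeholder-openai-run-42": {
        "secret": "REAL-KEY-HELD-SERVER-SIDE",
        "allowed_hosts": frozenset({"api.openai.com"}),
    },
}


def attach_credential(url: str, headers: dict) -> dict:
    """Return outbound headers with the placeholder replaced by the real key.

    Runs at egress on the server side; the runner never sees the result.
    Raises PermissionError if the target host is not allow-listed.
    """
    host = urlsplit(url).hostname
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    entry = VAULT.get(token)
    if entry is None:
        return dict(headers)  # no known placeholder, nothing to substitute
    if host not in entry["allowed_hosts"]:
        raise PermissionError(f"credential not allow-listed for host {host!r}")
    outbound = dict(headers)  # copy, so the runner-supplied headers stay untouched
    outbound["Authorization"] = f"{scheme} {entry['secret']}"
    return outbound
```

A placeholder leaked from the runner is useless on its own: it is not a key, and the substitution refuses any host outside the allow-list.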
- What’s merged (all default-OFF): the gateway skeleton (PR #637), the consumption-wiring into runner-spawn + a fail-closed `gateway_audit_log` (PR #640), and the host-side Class-A proxy-client (PR #645), all behind `AIMETIER_LLM_GATEWAY_ENABLED` (master) + the subordinate `AIMETIER_LLM_GATEWAY_CLASSA_PROXY_ENABLED`, both default-OFF, byte-identical when off.
- Class-B is the live-when-armed path: for provider-native CLIs (cursor, copilot), a per-run virtual-credential shim vends a short-lived credential into the CLI’s config, mapped to the real key at egress, revoked at run end.
- Class-A is wired but NOT end-to-end live: for base-URL-routable adapters (codex/openai, gemini), the host-side proxy-client is merged, but the TLS-terminating forward proxy that decrypts the runner’s provider TLS to inject the header is the documented remaining piece (a TODO). So Class-A is not end-to-end live even when its flag is on.
- The audit log is synchronous + fail-closed: `gateway_audit_log` (migration 0223) is a single synchronous INSERT appended atomically with each gateway action; if the audit write fails, the gateway refuses the request rather than serve it unlogged. It is not the async usage emitter.
- Trust model (owner-signed, ADR 0024 §Amendment 2026-06-25): TLS substitution = jailed-proxy TLS-termination (accepted as named; the honest claim is “only the jailed egress proxy reads the key, in-flight, never the agent code”); the gateway holds no standing credential (the server server-proxies each vault read); the routing library runs in a keyless RPC child. The master flag stays OFF; arming is a separate prod-tfvars-only PR.
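The fail-closed audit rule in the list above can be sketched with an in-memory SQLite table standing in for `gateway_audit_log`. The column set, the `serve_with_audit` function, and the `AuditWriteFailed` exception are assumptions for the example; the real table is defined by migration 0223 and is not reproduced here. The sketch simplifies "appended atomically with each gateway action" to "commit the row first, then serve".

```python
import sqlite3


class AuditWriteFailed(Exception):
    """The gateway refuses the request when the audit row cannot be written."""


def open_audit_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # Hypothetical columns; the real schema lives in migration 0223.
    db.execute(
        "CREATE TABLE gateway_audit_log ("
        "run_id TEXT NOT NULL, tenant_id TEXT NOT NULL, "
        "action TEXT NOT NULL, outcome TEXT NOT NULL)"
    )
    return db


def serve_with_audit(db, run_id, tenant_id, action, handler):
    """Write one synchronous audit row, then serve. No row means no service."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO gateway_audit_log VALUES (?, ?, ?, ?)",
                (run_id, tenant_id, action, "accepted"),
            )
    except sqlite3.Error as exc:
        raise AuditWriteFailed(f"audit write failed for {action}") from exc
    return handler()
```

The point of the ordering is that the handler is only reached after the row is committed, so a request can never be served unlogged.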
The MCP / tool proxy (ADR 0028) — increment-1 shipped, default-OFF
The companion gap to web-reading: a tenant can connect an MCP server or network tool in the app, but the isolated agent can’t reach it at run time. ADR 0028 generalizes the fetch broker’s shape to “call a tool on a tenant-connected MCP server”: the server makes the outbound MCP call on the runner’s behalf over the loopback bridge.
Increment-1 is SHIPPED default-OFF (PR #639); arming deferred. The proxy is merged on origin/main behind AIMETIER_MCP_PROXY_ENABLED (default-OFF, mcp_proxy_enabled="false" pinned in accounts/dev.tfvars, byte-identical when off: both verbs 404). It exposes two agent-facing verbs over the loopback bridge:
- papi.py mcp-tools: list the agent’s granted user-connected MCP tools.
- papi.py mcp-call: call one granted tool; the server makes the outbound call.
Code: server/src/routes/mcp-proxy.ts + server/src/services/mcp-proxy/. (The existing server/src/routes/mcp-server-out.ts is the opposite direction, exposing an AImetier agent as an MCP server to external clients, and is unrelated.) Shipped ≠ armed: the flag is OFF and the proxy is not armed on the live tenant.
The locked invariants live in the shipped code:
- The secret stays server-side (never vended into the runner).
- SSRF is re-validated + IP-pinned per call inside mcp-client.ts rpc() (DNS-rebinding TOCTOU closed, no redirect follow).
- Calls are grant-gated via the per-agent allow-list (ADR 0022, reused via tool-builder.ts + wrapWithScopeGate; connected MCP tools are enabledByDefault:false).
- Per-call server-derived-tenant audit.
- MCP results inherit ADR 0027’s untrusted-content framing.
The reason arming is deferred: a connected MCP server may live outside eu-west-3, making it a cross-border transfer + disclosed sub-processor. How the residency invariant applies is a deliberate open question that, with a DPA disclosure, must be settled before arming on a live tenant.
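The "SSRF re-validated + IP-pinned per call" invariant can be illustrated as follows. This is a hedged sketch in Python, not the logic inside mcp-client.ts; `pick_pinned_ip` is a hypothetical helper. The assumed flow is that the caller resolves the hostname once, passes every resolved address in, and then connects to the returned IP directly instead of resolving the name a second time, which is what removes the DNS-rebinding window between check and use.

```python
import ipaddress


def pick_pinned_ip(resolved: list) -> str:
    """Validate every resolved address and return the one to pin the call to.

    Refuses the whole call if any address is loopback, private, link-local,
    or otherwise not globally routable, so a mixed answer cannot slip through.
    """
    if not resolved:
        raise ValueError("hostname resolved to no addresses")
    for raw in resolved:
        ip = ipaddress.ip_address(raw)
        if not ip.is_global:
            raise PermissionError(f"refusing non-public address {ip}")
    return resolved[0]
```

Rejecting on any bad address, instead of filtering and continuing, keeps the check simple: a resolver answer that mixes public and internal addresses is treated as hostile.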
Sub-processor register + CI gate
A vendor backend (Firecrawl-EU / Exa / Tavily) is a disclosed sub-processor, so the broker can’t quietly add one in code without the disclosure landing with it. A sub-processor register (docs/compliance/sub-processors.md) + a CI gate (scripts/check-subprocessor-register.mjs, a job in .github/workflows/deploy.yml, PR #643) blocks arming a vendor backend unless it’s disclosed; the gate fails closed on an unknown backend.
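The gate’s fail-closed behaviour can be sketched like this. The shipped check is a Node script whose logic is not shown here; the Python below is an illustration only, and it assumes a register format (a markdown table whose first column is the vendor name, under a "Vendor" header) that may not match the real file. The function names and the built-in `in-vpc` exemption are likewise assumptions.

```python
def disclosed_backends(register_markdown: str) -> set:
    """Collect vendor names from the first column of markdown table rows."""
    names = set()
    for line in register_markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        first = line.strip("|").split("|")[0].strip().lower()
        # Skip the header row and the |---|---| separator row.
        if first and first != "vendor" and set(first) - set("-: "):
            names.add(first)
    return names


def undisclosed(armed_backends, register_markdown, builtin=frozenset({"in-vpc"})):
    """Return the armed backends missing from the register (empty = gate passes).

    Anything not explicitly known is reported, so an unknown backend fails closed.
    """
    known = disclosed_backends(register_markdown) | set(builtin)
    return sorted(b for b in armed_backends if b.lower() not in known)
```

A CI job built this way would exit non-zero whenever `undisclosed(...)` is non-empty, so adding a backend in code without a register entry blocks the deploy.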
Flags (all default-OFF / SAFE)
As pinned in infra/terraform/accounts/dev.tfvars:
| Flag (ECS tfvar) | Default | Live pin | Arms |
|---|---|---|---|
| `web_fetch_broker_enabled` | `false` | `false` | The whole web-fetch broker (everything below is inert while off) |
| `web_fetch_backend` | `in-vpc` | `in-vpc` | Which backend fetch uses; a non-EU value is refused at config-load |
| `web_fetch_ladder` | `false` | `false` | The free-first fetch ladder (Scrapling → in-VPC → vendor) |
| `web_fetch_secrets_enabled` | `false` | `false` | Lets the broker read the vendor key pools |
| `web_fetch_injection_screen` | `true` | `true` | The free L1 in-VPC heuristic screen (annotate-only); inert while the broker is off |
| `web_fetch_injection_judge_enabled` | `false` | `false` | The paid L3 Bedrock-Haiku judge; arming deferred |
| `llm_gateway_enabled` | `false` | `false` | The gateway master flag; off = gateway never constructed, keys stay stripped |
| `llm_gateway_classa_proxy_enabled` | `false` | `false` | Class-A egress proxy-client (subordinate); even when on, NOT live until the TLS terminator ships |
| `mcp_proxy_enabled` | `false` | `false` | The user-connected MCP/tool proxy (ADR 0028, PR #639); off = both verbs 404. Arming deferred pending the residency / sub-processor decision |
| `sandbox_bridge_chunking_enabled` | `false` | `false` | Loopback-bridge response chunking (PR #644); off = the 256 KB single-message clamp, on = bodies up to `sandbox_bridge_max_total_bytes` (8 MB) |
| `runner_egress_hardened` | `false` | `true` | The structural egress jail these lanes sit on (ADR 0024 §0); the one already-armed floor |
The corresponding AIMETIER_* env vars are wired through all three deploy paths (my_setup/.env{,.cloud}.template, docker/docker-compose.ai-agency.yml, infra/terraform/modules/ecs-cluster/{main.tf,variables.tf}).
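How a default-OFF flag and the "non-EU backend is refused at config-load" invariant might combine at startup is sketched below. The `AIMETIER_WEB_FETCH_*` variable names are guesses derived from the tfvar names in the table (only a few `AIMETIER_*` names are spelled out in this document), and the `EU_BACKENDS` allow-list and the loader function are hypothetical.

```python
# Hypothetical allow-list of EU-resident backend identifiers.
EU_BACKENDS = frozenset({"in-vpc", "firecrawl-eu"})


def _flag(env: dict, name: str, default: bool = False) -> bool:
    """Only the literal string "true" arms a flag; unset or anything else is off."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() == "true"


def load_web_fetch_config(env: dict) -> dict:
    """Build the broker config, refusing a non-EU backend before anything runs."""
    backend = env.get("AIMETIER_WEB_FETCH_BACKEND", "in-vpc")  # assumed name
    if backend not in EU_BACKENDS:
        raise ValueError(f"backend {backend!r} is not EU-resident; refusing to load")
    return {
        "enabled": _flag(env, "AIMETIER_WEB_FETCH_BROKER_ENABLED"),  # assumed name
        "backend": backend,
        "injection_screen": _flag(
            env, "AIMETIER_WEB_FETCH_INJECTION_SCREEN", default=True  # assumed name
        ),
    }
```

Validating the backend even while the broker flag is off means a bad value is caught at deploy time, not on the day someone flips the flag.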
Where to read further
| Concern | Doc |
|---|---|
| Runtime deep-dive (verbs, ladder, layers, gateway, MCP) | my_setup/docs/aimetier-explained/web-access.html |
| The jail these lanes sit on | my_setup/docs/aimetier-explained/source-isolation.html |
| Fetch broker design + locked invariants | ADR 0026 |
| Content-trust / detect-not-remove | ADR 0027 |
| Gateway + credential vault | ADR 0024 |
| MCP/tool proxy (increment-1 shipped default-OFF) | ADR 0028 |
| End-user view (“look things up + use your tools, safely”) | Web and connected tools |