THE AGENT ACTION FIREWALL REPORT

How many agents can take an irreversible action without a receipt?

The market is wiring AI agents into systems that move money, change permissions, and delete production data. This report measures one thing: whether an agent can take a dangerous action without an accountable human approval that anyone can later verify.

THE HEADLINE

We ran the fire drill across 12,000 servers in the public MCP registry. At least 10% advertise a high-risk capability, and in a tool-level sample of 10 servers, 90% of servers with dangerous tools require no receipt.

Two lenses, both honest.

Registry-wide (12,000 servers): a scan of each server's advertised name and description. This is a conservative floor, since most servers don't name a dangerous verb in their blurb.

Tool-level (10-server sample, 15 unguarded operations, mean score 10/100): a deeper look at what the tools actually expose.

Neither is a live deployment scan or a vulnerability claim. Run npx @emilia-protocol/fire-drill on your stack to add a data point.

METHODOLOGY

What the fire drill measures.

For each operation in an MCP manifest, OpenAPI spec, or tool list, the scanner classifies it into one of six high-risk families (or none) and checks whether each dangerous operation can execute without a receipt requirement. The Agent Action Firewall score is the share of dangerous operations that require a receipt; EG-1 passes only when that share is 100%.

Money movement: pay / payout / refund / transfer / payroll

Data destruction: delete / drop / truncate / purge (and any HTTP DELETE)

Production deploy: deploy / release / terraform apply / migrate

Permission change: IAM / role / grant / policy / RBAC

Bulk data export: export / dump / download / backup

Regulated override: override a claim, benefit, credit, or decision
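The keyword families above can be illustrated with a minimal Python sketch. This is not the scanner's actual implementation: the function name classify, the FAMILIES table layout, the whole-word matching rule, and the reduction of the regulated-override family to the single keyword "override" are all assumptions made for illustration. Only the keywords themselves and the HTTP DELETE rule come from the list above.

```python
import re
from typing import Optional

# Keywords copied from the family list above. The dict layout and the
# single "override" keyword for the last family are assumptions.
FAMILIES = {
    "money_movement": ["pay", "payout", "refund", "transfer", "payroll"],
    "data_destruction": ["delete", "drop", "truncate", "purge"],
    "production_deploy": ["deploy", "release", "terraform apply", "migrate"],
    "permission_change": ["iam", "role", "grant", "policy", "rbac"],
    "bulk_data_export": ["export", "dump", "download", "backup"],
    "regulated_override": ["override"],
}


def classify(name: str, description: str = "", http_method: str = "") -> Optional[str]:
    """Return the first matching high-risk family, or None if nothing matches."""
    # Any HTTP DELETE counts as data destruction, per the list above.
    if http_method.upper() == "DELETE":
        return "data_destruction"

    text = f"{name} {description}"
    # Split camelCase (issueRefund -> issue Refund), then turn every
    # non-alphanumeric run into a single space and lowercase the result.
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    text = re.sub(r"[^A-Za-z0-9]+", " ", text).lower()
    padded = f" {text.strip()} "

    # Hypothetical rule: whole-word (or whole-phrase) match only, so
    # "payload" does not trigger "pay". A real scanner may match differently.
    for family, keywords in FAMILIES.items():
        for keyword in keywords:
            if f" {keyword} " in padded:
                return family
    return None


if __name__ == "__main__":
    print(classify("delete_user"))
    print(classify("issueRefund"))
    print(classify("get_weather"))
```

Whole-word matching keeps false positives low but misses inflected forms such as "payments" or "policies", which fits the report's description of the registry-wide scan as a conservative floor.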

This is a static assessment of the manifest or spec, in the same spirit as SSL Labs or npm audit. A fix that passes the static check is then verified at runtime with EG-1 conformance.
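As a rough illustration of the scoring rule in this section, the sketch below computes the share of dangerous operations that require a receipt and applies the 100% threshold for EG-1. The Operation record, its dangerous and requires_receipt fields, and both function names are hypothetical. How the real scanner detects a receipt requirement is not described here, and the handling of a spec with no dangerous operations (returning None) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Operation:
    # Hypothetical record for one scanned operation.
    name: str
    dangerous: bool          # falls into one of the high-risk families
    requires_receipt: bool   # assumed to be detected elsewhere


def firewall_score(ops: List[Operation]) -> Optional[float]:
    """Percentage of dangerous operations that require a receipt."""
    dangerous = [op for op in ops if op.dangerous]
    if not dangerous:
        # Assumption: with nothing dangerous to guard, the score is undefined.
        return None
    guarded = sum(1 for op in dangerous if op.requires_receipt)
    return 100.0 * guarded / len(dangerous)


def passes_eg1(ops: List[Operation]) -> Optional[bool]:
    """Pass only when every dangerous operation requires a receipt."""
    score = firewall_score(ops)
    if score is None:
        return None  # assumption: undetermined rather than pass or fail
    return score == 100.0


if __name__ == "__main__":
    ops = [
        Operation("refund_order", dangerous=True, requires_receipt=True),
        Operation("delete_bucket", dangerous=True, requires_receipt=False),
        Operation("get_status", dangerous=False, requires_receipt=False),
    ]
    print(firewall_score(ops), passes_eg1(ops))
```

Non-dangerous operations are excluded from the denominator, so adding harmless read-only tools to a server neither raises nor lowers its score.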
