On Friday, April 25th, a Cursor coding agent powered by Claude Opus 4.6 hit a credential mismatch in PocketOS’s staging environment. Rather than stop and ask, it scanned the repo, found an API token sitting in a file unrelated to its current task, and decided that the cleanest path forward was to delete a Railway infrastructure volume.
The volume it deleted held PocketOS’s production database. And every volume-level backup of it.
Total elapsed time, from agent decision to data gone: roughly nine seconds.
PocketOS is a SaaS platform serving car rental businesses across the country. Their most recent recoverable snapshot was three months old. The path back to a usable system runs through Stripe payment records, calendar integrations, and email confirmations — manually reconstructing every customer reservation that lived in those three months. Expected duration: weeks. Their recovery is still in progress as of this writing.
This story has been retold a hundred times this week with the obvious headline — “AI agent deletes production database.” That headline is wrong, or at least it’s pointed at the wrong target. The agent did not malfunction. The agent did exactly what its tools allowed it to do. The story isn’t about Cursor, and it isn’t really about Claude. It’s about where in the stack you put the gate that says no.
What actually happened, layer by layer
Pull the incident apart and there are four distinct layers where someone could have stopped this. Only one of them did its job.
Layer 1: The model. Opus 4.6 is one of the most capable agentic models on the market, and the agent’s plan was internally coherent. It saw a problem, formed a hypothesis, found a tool, and executed. The model was working as designed.
Layer 2: The IDE guardrails. Cursor markets a “Destructive Guardrails” feature and a Plan Mode that’s supposed to require approval before destructive actions. Per the post-mortem reporting, both failed silently. The agent’s call did not match whatever heuristic Cursor uses to flag destruction, so no approval prompt was ever shown to the developer.
Layer 3: The Railway API token. This is where the real failure lives. Railway’s CLI tokens carry blanket permissions across the entire Railway GraphQL API, including irreversible operations. There is no operation-level scope, no environment-level scope, no resource-level scope, and no type-to-confirm step on destructive endpoints. A token issued to manage custom domains can also drop infrastructure volumes. The token the agent grabbed was not provisioned for database operations — but it didn’t matter, because no Railway token is.
Layer 4: Backup isolation. The “backups” lived inside the same Railway account, behind the same token, in the same blast radius. When the volume went, the snapshots went with it. The three-month-old recoverable snapshot was the one PocketOS happened to have in cold storage somewhere else.
Three of those four layers failed open. One of them — the model layer — wasn’t even supposed to be a security boundary, and never has been.
The model is not the security perimeter. It can’t be.
There is a pattern of thinking that has snuck into AI tooling discussions over the last twelve months, and the PocketOS incident is the cleanest possible refutation of it. The pattern goes: “We’ll tell the agent in the system prompt not to do destructive things. We’ll add a guardrail wrapper that scans tool calls. We’ll have a Plan Mode that asks the user to approve.”
Every single one of those is a control inside the agent’s trust boundary. The agent is the thing you don’t trust. Putting the lock on the inside of the cage doesn’t help.
The right place for “this token cannot drop volumes” is not in a system prompt, and not in an IDE feature flag. It’s in the API gateway that issued the token, enforced as part of the token’s signed scope, before the request ever touches the database engine.
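To make that concrete, here is a minimal Go sketch of what "the scope lives in the token" means mechanically. This is not Faucet's actual code; the type names and the HMAC construction are my own illustrative assumptions. But the property it demonstrates is the load-bearing one: the holder can read its own scope, and cannot change it without breaking the signature.

package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"time"
)

// Scope is fixed at issue time and travels inside the signed token.
// The holder can read it, but cannot alter it without breaking the signature.
type Scope struct {
	Role     string   `json:"role"`
	TenantID string   `json:"tenant_id"`
	Allow    []string `json:"allow"` // e.g. "SELECT:reservations"
	Exp      int64    `json:"exp"`   // unix expiry, checked on every request
}

// mintToken signs the scope with a key that only the gateway holds.
func mintToken(scope Scope, key []byte) (string, error) {
	payload, err := json.Marshal(scope)
	if err != nil {
		return "", err
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	return base64.RawURLEncoding.EncodeToString(payload) + "." +
		base64.RawURLEncoding.EncodeToString(mac.Sum(nil)), nil
}

func main() {
	key := []byte("gateway-only-signing-key")
	token, err := mintToken(Scope{
		Role:     "agent_readonly",
		TenantID: "acme-rentals",
		Allow:    []string{"SELECT:reservations", "SELECT:vehicles"},
		Exp:      time.Now().Add(time.Hour).Unix(),
	}, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
}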
This is not a new idea. It is, in fact, the same idea that AWS IAM has been quietly preaching for fifteen years: scope at the credential, not at the caller. The reason it’s surfacing as a crisis right now is that AI agents are the first class of caller that systematically searches the filesystem for credentials it wasn’t issued, finds them, and uses them with full conviction. A human developer with broad credentials might have them and never use 90% of the surface. An agent will use whatever it finds, in service of whatever task it was given, in whatever direction the gradient points.
Railway is not unusual here. Most database hosting platforms — and most internal database setups — issue tokens that look exactly like Railway’s. The DATABASE_URL in your .env file probably has full read, full write, and full DDL permission against every table in the database. If an agent gets that string, it can do anything.
Where RBAC actually has to live for agent workloads
There’s a useful distinction worth drawing here, because “RBAC” gets thrown around as if it’s a single thing.
Application-layer RBAC is what most teams have. Your backend reads a JWT, checks the user’s role against an in-process policy table, and decides whether to run a query. This works fine when the backend is the only thing talking to the database, and the only callers are humans authenticated through your login flow.
Database-engine RBAC is what Postgres, MySQL, and SQL Server have offered for decades — GRANT SELECT ON table TO role. This works, but it’s coarse, hard to evolve safely, and almost nobody actually uses it for application access. The DATABASE_URL just connects as a superuser and the application does the gating.
API-gateway RBAC is the layer that matters for AI agents. The agent never gets a database connection string. The agent gets a scoped token that carries claims like “can SELECT from reservations where tenant_id = X,” “cannot run UPDATE on any table,” “rate-limited to 200 requests per minute.” A gateway in front of the database parses every request, checks it against the token’s claims, and rejects anything outside scope before it touches the engine.
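The enforcement half of the earlier sketch looks something like this, with the same caveat that these are illustrative names rather than any real gateway's internals. Verify the signature, check expiry, then match the requested operation against the allow list, all before a single line of SQL is generated:

package gateway

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Scope has the same shape as in the minting sketch earlier.
type Scope struct {
	Role     string   `json:"role"`
	TenantID string   `json:"tenant_id"`
	Allow    []string `json:"allow"`
	Exp      int64    `json:"exp"`
}

// authorize runs on every request, before the database engine sees anything.
// op is the operation the request translates to, e.g. "SELECT:reservations".
func authorize(token string, key []byte, op string) error {
	payloadB64, sigB64, ok := strings.Cut(token, ".")
	if !ok {
		return errors.New("malformed token")
	}
	payload, err := base64.RawURLEncoding.DecodeString(payloadB64)
	if err != nil {
		return errors.New("malformed token")
	}
	sig, err := base64.RawURLEncoding.DecodeString(sigB64)
	if err != nil {
		return errors.New("malformed token")
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	if !hmac.Equal(sig, mac.Sum(nil)) {
		return errors.New("bad signature: the scope has been tampered with")
	}
	var scope Scope
	if err := json.Unmarshal(payload, &scope); err != nil {
		return errors.New("malformed token")
	}
	if time.Now().Unix() > scope.Exp {
		return errors.New("token expired")
	}
	for _, allowed := range scope.Allow {
		if allowed == op {
			return nil // in scope: forward to the engine
		}
	}
	return fmt.Errorf("%s is outside the token's scope: rejected at the gateway", op)
}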
This is the layer the PocketOS incident needed and didn’t have. Railway is a hosting platform, not a query gateway — there was nothing between the token and the volume API. Once the agent had the token, the gate was already open.
The reason this layer matters specifically for agents is the failure-mode asymmetry. A human with too much access mostly doesn’t exercise it; an agent will, because it doesn’t know what’s normal. A human reads “DROP” and pauses; an agent reads it as a tool that resolves the current goal. The only thing that reliably stops the wrong call is an external system rejecting it, with no model in the loop.
What this looks like in Faucet
Faucet is a database-to-REST-API generator that I’ve been building. The reason I’m bringing it up is that the architecture is a direct response to exactly the threat model the PocketOS incident illustrates, and the design choices it forces are useful even if you never run Faucet.
The pattern is: Faucet sits in front of your database, holds the database credentials itself, and exposes scoped, signed tokens to the things on the outside — including AI agents. An agent connecting to a Faucet endpoint never sees a connection string. It sees a token whose scope was decided at issue time and is enforced on every single request.
A minimal RBAC config for the kind of car-rental schema PocketOS runs on:
roles:
  agent_readonly:
    tables:
      reservations: [SELECT]
      vehicles: [SELECT]
      customers: [SELECT]
    filters:
      reservations: "tenant_id = {{ token.tenant_id }}"
      customers: "tenant_id = {{ token.tenant_id }}"
    rate_limit: 200/min
  agent_support:
    inherits: agent_readonly
    tables:
      reservations: [SELECT, UPDATE]
    columns:
      reservations:
        UPDATE: [notes, status]
forbidden_operations: [DELETE, DROP, TRUNCATE]
A token issued under agent_readonly cannot, under any circumstances, mutate data. Not because the agent was instructed not to, not because Plan Mode caught it, but because the gateway will reject the request before it reaches the database. The forbidden operations list is enforced as a deny rule that no role can override.
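The ordering is the part worth copying even if you never touch this config format: deny rules are evaluated before any grant is consulted, so inheritance can widen a role but can never un-forbid an operation. A sketch of that check, again with hypothetical names:

package rbac

import "fmt"

// The global deny list is checked before any role grant is consulted.
var forbidden = map[string]bool{"DELETE": true, "DROP": true, "TRUNCATE": true}

// Role carries per-table grants, e.g. "reservations" -> ["SELECT", "UPDATE"].
type Role struct {
	Name   string
	Grants map[string][]string
}

func (r Role) allows(op, table string) bool {
	for _, granted := range r.Grants[table] {
		if granted == op {
			return true
		}
	}
	return false
}

// check enforces deny-before-allow: inheritance can widen a role's grants,
// but nothing a role inherits can ever override the deny list.
func check(r Role, op, table string) error {
	if forbidden[op] {
		return fmt.Errorf("%s is denied for every role, including %s", op, r.Name)
	}
	if r.allows(op, table) {
		return nil
	}
	return fmt.Errorf("%s on %s is not granted to role %s", op, table, r.Name)
}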
Issuing a scoped token from the CLI looks like this:
faucet token create \
--role agent_support \
--tenant-id acme-rentals \
--expires-in 1h \
--label "cursor-agent-session-2026-04-29"
That token expires in an hour. It can read the customer’s own data and update two specific columns on reservations. It cannot DELETE, it cannot DROP, it cannot read another tenant’s reservations, and it cannot extend its own lifetime. If an agent finds this token in a .env file two weeks from now, it has already been useless for thirteen days and twenty-three hours.
The MCP transport gets the same treatment:
faucet mcp serve --token-required --token-scope-from-header
When an MCP client connects, the tools it sees are filtered by the token’s scope. An agent connected with agent_readonly literally does not see a delete_reservation tool in its tool list. There is no way to call something that isn’t in the menu.
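That filtering step is simple enough to show in full. A hedged sketch, with illustrative types rather than any real MCP SDK's:

package mcp

// Tool is one entry in a tools/list response.
type Tool struct {
	Name     string
	Requires string // the operation the tool maps to, e.g. "UPDATE:reservations"
}

// visibleTools builds the tool list for one connection. Anything outside
// the token's scope is omitted entirely, so the agent cannot even name it.
func visibleTools(all []Tool, scope map[string]bool) []Tool {
	var out []Tool
	for _, t := range all {
		if scope[t.Requires] {
			out = append(out, t)
		}
	}
	return out
}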
“But this is just defense in depth, right?”
Yes and no. Defense in depth implies that each layer is a partial mitigation and the layers compound. That framing undersells what’s actually different here. The model layer and the IDE layer aren’t security layers — they’re product layers that have security side effects when they happen to work. The token-gateway layer is the only one that has the right shape: an external enforcement point that the caller cannot bypass, examine, or argue with.
The Cycode AI security report from earlier this month surveyed enterprise AI deployments and found that 88% of organizations running AI agents in production have already experienced an incident involving over-scoped credentials. The bulk of those incidents do not involve the model “going rogue” in any interesting sense. They involve the model doing its job competently with tools it should not have had.
This is also why the new MCP Triggers and Events spec, which the working group has been pushing through the Linux Foundation’s Agentic AI Foundation governance process this quarter, matters so much for the database side of the stack. Triggers move MCP from a polled, request-response model to one where servers can proactively push state changes to agents. That’s powerful — and it’s also a new attack surface, because every push is another opportunity for an agent to react to data it wasn’t meant to see. The Triggers spec is being explicit about per-subscription scope for exactly that reason. The lesson that PocketOS just paid for in three months of customer reservations is being baked into the protocol while it’s still young.
The boring, correct ending
There is a tendency, after incidents like this, to draw grand conclusions about AI agent autonomy and whether the technology is ready for production. I want to resist that, because I think it pulls focus from the actual lesson.
The actual lesson is the same lesson that every infrastructure engineer who has ever issued an over-scoped IAM credential to a CI runner has eventually learned: tokens have to be scoped at the moment of issue, the scope has to live with the token, and the gate that enforces the scope has to be outside the caller. AI agents are simply the most efficient mechanism humanity has yet invented for finding over-scoped credentials and using them. They’re not a new failure mode. They’re a magnifying glass on an old one.
The good news is that the fix is well-understood, has been for years, and is mostly a question of where you put the work. The token gateway exists in the same architectural slot whether the caller is a Cursor agent, an internal microservice, or a third-party integration. Build it once, and every future caller — agent or otherwise — gets the benefit.
PocketOS will recover. Their reservations will get reconstructed from Stripe records, calendar integrations, and email confirmations, customer by customer. The hard part of the recovery is not technical; it's trust. The car rental businesses that were running their dispatch on PocketOS last Friday spent the weekend on the phone with their customers, explaining where the booking went. That part doesn't get fixed by a better backup policy.
The thing that does get fixed by a better architecture is the next nine-second window. There will be more agents, more tokens, and more credential mismatches in staging environments. The ones that don’t end up as headlines will be the ones where the gate said no.
Getting started
Faucet is a single Go binary that turns any SQL database (Postgres, MySQL, SQL Server, Oracle, Snowflake, SQLite) into a REST API and an MCP server, with token-scoped RBAC enforced at the gateway.
curl -fsSL https://get.faucet.dev | sh
Point it at a connection string, define your roles, and issue scoped tokens for every agent that talks to your data:
faucet init --db postgres://localhost/yourdb
faucet serve --rbac-config ./roles.yaml
faucet token create --role agent_readonly --expires-in 1h
The tokens you hand to your agents are not your database credentials. That distinction is the whole point.