On April 8, Anthropic shipped Claude Managed Agents into public beta. Notion, Rakuten, and Asana were the named early adopters. The headline that engineers fixated on was the new pricing axis: $0.08 per session-hour, billed to the millisecond, with idle time free. Tokens still meter on top.
Read that pricing carefully. Anthropic has separated thinking (tokens, priced as before) from running (session-hours, new). When the agent sits paused waiting for a human approval or a queued downstream job, the meter stops. When it is actively executing a tool — say, waiting on a database query — the meter ticks.
This is the first commercial agent infrastructure where slow data access is a directly billable line item. The pricing template is going to set the floor for the rest of the industry within two quarters.
If you are shipping agents that touch a database — and at this point, most of them do — the economics of agent design have shifted. Below is what changed and how to adapt.
What “Session-Hour” Actually Bills
A session is one long-running agent execution. Anthropic’s runtime owns the sandbox, the file system, the in-progress checkpoints, and the tool call orchestration. You hand them a prompt and a tool spec; they keep the lights on for as long as the agent needs to work.
The breakdown:
- Tokens: standard rates. For Opus 4.6, $5/M input and $25/M output.
- Session runtime: $0.08/hour, ms-billed, only when “running.”
- Code execution: rolled into session runtime — no separate container-hour charge anymore.
“Running” includes anything the runtime is actively driving: model inference, code interpreter sandbox, MCP tool calls, file I/O. “Idle” is when the agent is paused waiting for the user, an approval, or a queued downstream task.
The implication is direct: a tool call that takes 40 seconds because of a slow JOIN is 40 seconds of metered runtime. Multiply by thousands of agent runs per day across an enterprise and the math gets uncomfortable fast.
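A back-of-envelope check at the rate quoted above (the query time and run count are illustrative):

RATE = 0.08  # $ per session-hour, billed to the millisecond
# One 40-second query per run, 5,000 runs per day, for a year:
print(f"${40 * 5_000 * 365 / 3600 * RATE:,.0f}")  # -> $1,622 for a single slow query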
The Bottleneck Is Almost Always the Database
The April 14 TianPan post “Why Your Database Melts When AI Features Ship” went around for a reason. The breakdown: when you bolt an LLM onto an existing application database, you import a workload pattern the database was never tuned for — high-concurrency, long-tail-latency, bursty fan-out reads from agents asking the same kinds of questions over and over.
The pre-agent architecture assumed:
- A single user request acquires a connection.
- Maybe one or two queries fire.
- The connection releases in milliseconds.
The agent architecture inverts every assumption:
- A single agent run might fire 20+ tool calls.
- Each tool call may ask for joined, filtered, paginated data.
- The agent holds context across calls; the database does not.
Without an API layer, your agent ends up either embedding a SQL string in a tool call or asking a generic MCP server to run queries it generates on the fly. Both patterns are slow, both are dangerous, and both are now billed at $0.08 per session-hour while they run.
The architectural fix that LangChain, Databricks, and just about every responsible agent framework shipped in the last six months is the same: never hold a database connection while waiting on the LLM. Query, release, think, query again. That is only possible when there is a fast, well-defined API in front of the database. Generated SQL hitting the raw database with a held connection is the path of least resistance and the path most likely to ruin your weekend.
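A minimal sketch of that loop, assuming a fast HTTP API sits in front of the database; the endpoint path and the llm client are stand-ins, not any particular framework's API:

import requests

session = requests.Session()  # reuses TCP connections across tool calls

def fetch(path: str, params: dict) -> list[dict]:
    # Query, then release: the HTTP connection returns to the client pool here,
    # and the agent never held a database connection at all.
    resp = session.get(f"http://localhost:8080{path}", params=params, timeout=5)
    resp.raise_for_status()
    return resp.json()

def agent_step(llm, question: str) -> str:
    rows = fetch("/api/orders", {"status": "pending", "limit": 50})  # query
    return llm.ask(question, context=rows)  # think: nothing downstream is blocked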
Databricks Already Sees the Bill Coming
A week after Anthropic’s launch, Databricks announced Unity AI Gateway with MCP governance support on April 15. Read that announcement alongside the new agent pricing model and the message becomes clear: every external MCP server now needs a registered identity, on-behalf-of permissions, and a centralized audit table. Every agent that calls an MCP server on behalf of a user goes through the gateway.
That is good for security. It is also another network hop between the agent and the data, and every governance hop is billed runtime.
Salesforce Headless 360 at TDX on April 16 and Google’s Apigee MCP bridge at Cloud Next on April 22-24 push in the same direction: every enterprise platform is exposing itself as MCP tools. Every agent with a job to do now has an order of magnitude more places to call. With session-hour pricing, choosing a fast, low-latency tool over a slow, multi-hop one is now a unit-economics decision, not a developer-experience preference.
The Cost Math, Concretely
Suppose your agent runs 1,000 sessions per day, each averaging 8 minutes and roughly 100 tool calls. How much of that time is spent inside tool calls depends on how fast the data layer answers; the rest is model thinking, paused approvals, and downstream queue waits.
| Tool latency | Tool time per session | Daily session-runtime cost (1k runs) | Annual cost |
|---|---|---|---|
| 50 ms (cached API) | 5 sec | $0.11 | $41 |
| 800 ms (raw SQL) | 80 sec | $1.78 | $649 |
| 2.5 sec (slow JOIN) | 250 sec | $5.56 | $2,028 |
Those numbers do not include token costs or the cost of the model thinking between tool calls, and they scale linearly: at 10,000 sessions a day, the slow-JOIN line is over $20,000 a year. They are just the database access tax.
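For anyone who wants to check the arithmetic (the tool-time column follows from the ~100 tool calls per session in the scenario above):

RATE, SESSIONS, CALLS = 0.08, 1_000, 100  # $/session-hour, runs/day, tool calls/session
for label, latency in [("cached API", 0.05), ("raw SQL", 0.8), ("slow JOIN", 2.5)]:
    tool_seconds = latency * CALLS                 # tool time per session
    daily = tool_seconds * SESSIONS / 3600 * RATE  # metered hours times the rate
    print(f"{label}: ${daily:.2f}/day, ${daily * 365:,.0f}/year")
# cached API: $0.11/day, $41/year
# raw SQL: $1.78/day, $649/year
# slow JOIN: $5.56/day, $2,028/year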
This is the part that flips the standard “developer time vs. infrastructure cost” tradeoff. Until April 8, slow database queries were a developer-experience problem. After April 8, they show up on the corporate Anthropic invoice every month.
Faucet, Quickly
Faucet is a single Go binary that points at a database and exposes it as REST endpoints, an OpenAPI 3.1 spec, and an MCP server, all on the same port. It connects to PostgreSQL, MySQL, SQL Server, Oracle, Snowflake, and SQLite. No code generation, no schemas to maintain, no separate API service to deploy.
Two reasons this matters for the new agent pricing:
Connection pooling, by default. Faucet 0.1.9 set sane defaults at 100 max connections and 25 max idle. Your agent does not open and close a fresh TCP connection for every tool call; the binary holds the pool. When a thousand parallel agent tool calls arrive, they fan out across a healthy pool instead of stampeding the database with new connection handshakes.
Tools matched to schema. When Faucet exposes a database as MCP, every table becomes a typed tool with concrete column types, primary key lookups, and filter operators (including the boolean filter fix shipped in v0.1.12). The agent is not generating SQL strings hoping the column exists — it is calling a tool that already knows the schema. That cuts both latency and the retry loop when the LLM hallucinates a column name.
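For reference, an MCP tool definition carries a JSON Schema for its inputs, which is what lets a schema-derived tool constrain the model before a query ever runs. A hypothetical sketch of one such tool (not Faucet’s literal output):

filter_orders = {
    "name": "filter_orders",
    "description": "Filter rows in the orders table",
    "inputSchema": {  # JSON Schema, per the MCP spec
        "type": "object",
        "properties": {
            "status": {"type": "string"},                  # typed from the column
            "limit": {"type": "integer", "maximum": 500},  # bounded result size
            "order_by": {"enum": ["created_at", "-created_at"]},
        },
    },
}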
A 60-second example. Suppose you have a Postgres warehouse and want to expose orders, customers, and products to an agent.
curl -fsSL https://get.faucet.dev | sh
faucet configure \
  --db postgres://user:pass@host:5432/warehouse \
  --tables orders,customers,products
faucet serve
That is a REST API at http://localhost:8080/api/orders, an OpenAPI spec at /openapi.json, and an MCP endpoint at /mcp.
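To see what was generated, read the spec back; the paths in the comment are illustrative:

import requests

spec = requests.get("http://localhost:8080/openapi.json", timeout=5).json()
print(sorted(spec["paths"]))  # e.g. ['/api/customers', '/api/orders', '/api/products']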
To wire it into Claude Code:
claude mcp add --transport http faucet http://localhost:8080/mcp
The agent now sees list_orders, get_orders_by_id, filter_customers_by_email, and per-column filter tools: typed, fast, pooled, audited. Median latency for a single-row lookup against a warm pool is under 30 ms. That is the difference between the $41/year row and the $2,028/year row in the table above.
For the agent itself, the difference is even more direct. Compare the two patterns:
# Anti-pattern: generic run_sql tool
agent.tool("run_sql", "SELECT o.*, c.email FROM orders o JOIN customers c "
                      "ON c.id = o.customer_id WHERE o.status = 'pending' "
                      "ORDER BY o.created_at DESC LIMIT 50")
# Latency: 800ms - 2.5s. Risk: full table scan.
# Retry on hallucinated column name. Held connection during LLM thinking.
# Faucet pattern: typed tool against a pooled, pre-defined endpoint
agent.tool("filter_orders",
           status="pending",
           order_by="-created_at",
           limit=50,
           expand=["customer.email"])
# Latency: ~30ms. No retry. Connection released to pool immediately.
Same data. Different invoice.
The Anti-Pattern To Avoid
A pattern shipping right now that will get more expensive every month: a thin MCP server that exposes a single run_sql tool against a production database.
Three problems compound:
- Generated SQL is slow. LLMs default to over-broad SELECT * queries with no useful indexes.
- Generated SQL is risky. Even read-only roles can run SELECT * FROM orders and pull 50 GB through the network.
- Generated SQL is metered now. Every retry loop where the LLM rewrites the query costs session runtime.
The fix is not to ban SQL — it is to put a typed, schema-aware boundary between the agent and the database. That is what API-first architecture has been about for fifteen years; it is now also what session-hour pricing demands.
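In miniature, that boundary can be as small as one function: the tool accepts a fixed set of typed parameters and builds the SQL itself, so the model never writes or retries a query string. A sketch with psycopg-style placeholders; the table, columns, and caps are illustrative:

ALLOWED_STATUSES = {"pending", "shipped", "cancelled"}

def filter_orders(conn, status: str, limit: int = 50) -> list[tuple]:
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")  # fail fast, no metered retry loop
    limit = min(limit, 500)  # hard cap: no accidental 50 GB pulls
    with conn.cursor() as cur:  # connection borrowed and returned inside the call
        cur.execute(
            "SELECT id, status, created_at FROM orders"
            " WHERE status = %s ORDER BY created_at DESC LIMIT %s",
            (status, limit),
        )
        return cur.fetchall()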
What Changes Next
A few predictions, since the pricing template will spread:
- OpenAI and Google will follow. They have similar agent runtimes (OpenAI Agents SDK, Vertex AI Agent Engine, now folded into the Gemini Enterprise Agent Platform). Once one major lab prices runtime as a metered axis, the rest follow within two quarters. By Q3 2026, expect every hosted agent platform to price runtime separately from tokens.
- “Agent latency budget” becomes a real engineering concept. Today, p99 latency is something the SRE team worries about. With session-hour pricing, it shows up on the same dashboard as cloud spend, and platform engineering teams will start defending it line-by-line.
- MCP server quality becomes economically observable. A poorly written MCP server that takes 4 seconds to return a simple lookup will lose to a fast one — not because anyone ran a benchmark, but because procurement noticed the bill.
- Database-to-API tooling consolidates. Hand-rolled REST APIs, ORMs with too many abstractions, and “just write some endpoints” middleware all start losing to instant generation. The six-week internal API project stops looking cheap when slow data is on the invoice.
The broader shift: for the first decade of LLMs, the model’s thinking dominated the cost. Now the bridge between the model and the rest of your stack is on the bill too. More often than not, that bridge is database access. It is now metered.
Getting Started
Faucet takes about 60 seconds to install and another 60 to point at a database.
curl -fsSL https://get.faucet.dev | sh
faucet --version
faucet configure --db <your-db-url>
faucet serve
The MCP endpoint is at /mcp. The OpenAPI spec is at /openapi.json. The Web UI is at /admin. RBAC, audit logs, and connection pooling are on by default. If you are running multiple databases, point Faucet at all of them — the same binary handles PostgreSQL, MySQL, SQL Server, Oracle, Snowflake, and SQLite.
If you are shipping agents in the back half of 2026, the “thin tool layer in front of the database” question is no longer an architecture preference. It is now in the contract you signed with Anthropic.