MCP SEP-2663 — async task handles for long tool calls
The news. On May 15, 2026, MCP SEP-2663 was merged, replacing the experimental "tasks" feature with a cleaner async-handle model. Servers can now respond to a tool call with a Task object instead of a blocking result; the response carries a polymorphic resultType: "task" plus a handle the client uses to poll, drive, or cancel. The redesign drops client-hosted tasks to comply with SEP-2260's rule that servers cannot make unsolicited requests.
Picture the valet stand. The slow path is standing next to it the entire time your car is parked. You hand over the keys, the valet drives off, and you wait at the curb. The line behind you forms. Nothing else happens. If you have to leave — a phone call, your colleague calling you upstairs — you lose your spot, and there is no clean way to come back. That is exactly what a synchronous tools/call looks like in MCP today. The agent's tool-use loop emits the call, the server holds the connection until the tool finishes, and the client process is pinned in place for the full duration. The expensive accelerator in your stack is doing real work; the slim client process babysitting it is doing nothing.
The fast path is walking away with a claim ticket. You hand the valet your keys, they hand you back a small tag with a number on it, and you go on with your day. The claim ticket is the Task handle. It is not the car. It stands in for the car. You can hold it, share it ("I left my keys with the guy at the front, here's the ticket"), or come back later to pick it up. You can ask "is my car ready yet?" any number of times without standing at the stand. And if the valet needs to check something — "do you want it pulled up to the front or the back?" — they can ask you when you next walk past, instead of holding the entire transaction open waiting for an answer.
The server encodes this with a polymorphic response — conceptually { resultType: "task", taskId: "abc123", status: "running", ... }, with the spec also including timestamps and a suggested poll interval — and starts the actual work in the background. The client gets the handle and is free. Three new JSON-RPC methods anchor the extension: tasks/get polls the status (running, requires input, completed, failed, cancelled) and, when complete, returns the final result. tasks/update sends responses to any input requests the server raised mid-flight — a server-driven question handled out of band. tasks/cancel is a fire-and-forget abort: eventually consistent, the server stops at its next safe point.
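To make that flow concrete, here is a minimal client-side sketch in TypeScript. Only resultType, taskId, status, and the method names come from the text above; everything else (the pollIntervalMs and createdAt field names, the "requires_input" wire value, the generic rpc helper) is an illustrative assumption, not the spec's verbatim schema.

```typescript
// Illustrative sketch only. Field names beyond resultType/taskId/status and the
// exact status strings are assumptions, not the SEP-2663 schema verbatim.
type TaskStatus = "running" | "requires_input" | "completed" | "failed" | "cancelled";

interface TaskHandle {
  resultType: "task";
  taskId: string;
  status: TaskStatus;
  createdAt?: string;       // assumed name for the spec's timestamp
  pollIntervalMs?: number;  // assumed name for the suggested poll interval
}

interface ImmediateResult {
  resultType: "result";     // assumed discriminator for the unchanged fast path
  content: unknown;
}

type ToolCallResponse = TaskHandle | ImmediateResult;

// Generic JSON-RPC helper standing in for whatever MCP client library is in use.
declare function rpc(method: string, params: object): Promise<any>;
// Placeholder for however the harness answers a server-raised question.
declare function answerInputRequest(req: unknown): Promise<unknown>;

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function callTool(name: string, args: object): Promise<unknown> {
  const res: ToolCallResponse = await rpc("tools/call", { name, arguments: args });
  if (res.resultType !== "task") return res.content; // fast tools are unchanged

  // Slow path: the connection is free; the client polls the handle instead.
  for (;;) {
    const task: TaskHandle & { result?: unknown; inputRequest?: unknown } =
      await rpc("tasks/get", { taskId: res.taskId });

    switch (task.status) {
      case "completed":
        return task.result;
      case "failed":
      case "cancelled":
        throw new Error(`task ${res.taskId} ended with status ${task.status}`);
      case "requires_input":
        // Answer the server's mid-flight question out of band via tasks/update.
        await rpc("tasks/update", {
          taskId: res.taskId,
          input: await answerInputRequest(task.inputRequest),
        });
        break;
      default:
        break; // still running
    }
    await sleep(task.pollIntervalMs ?? 1000);
  }
}
```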
The catch — and this is where the old design got rewritten — is direction. The previous experimental task flow leaned on server-initiated events to the client. That feels natural until you remember that MCP's SEP-2260 bars servers from making unsolicited requests on transports that don't support reverse calls — short-lived HTTP, server-to-server bridges, simple stdio pipes. SEP-2663 flips the primary direction: by default the server only responds to client-initiated calls (tools/call, tasks/get, tasks/update, tasks/cancel), and the client owns the timing. For clients that want push-style updates, an optional subscription flow (subscriptions/listen + notifications/tasks) is available, but the client must opt in. That keeps the protocol legal across every transport and makes the task lifecycle workable for a crash-tolerant agent harness — if the client dies and restarts, the same taskId can be polled again.
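The resumability claim follows directly from the handle being plain data the harness can persist. A sketch under the same assumptions as above, with a hypothetical CheckpointStore standing in for whatever persistence layer the harness already has (nothing here is specified by SEP-2663 itself):

```typescript
// Hypothetical checkpoint store; the persistence mechanism is the harness's
// choice, not anything SEP-2663 specifies.
interface CheckpointStore {
  saveTask(sessionId: string, taskId: string): Promise<void>;
  pendingTasks(sessionId: string): Promise<string[]>;
}

declare const store: CheckpointStore;
declare function rpc(method: string, params: object): Promise<any>;

// On issuing a slow tool call: persist the handle before doing anything else,
// so a crash between "handle received" and "result received" loses nothing.
async function startTrackedTool(sessionId: string, name: string, args: object) {
  const res = await rpc("tools/call", { name, arguments: args });
  if (res.resultType === "task") await store.saveTask(sessionId, res.taskId);
  return res;
}

// On restart: poll the same taskId again, exactly as if the client had never
// gone away. No server-side re-registration is needed.
async function resumeSession(sessionId: string) {
  for (const taskId of await store.pendingTasks(sessionId)) {
    const task = await rpc("tasks/get", { taskId });
    if (task.status === "completed") {
      // hand task.result back to the agent loop
    } else {
      // drop back into the normal poll / tasks/update loop
    }
  }
}
```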
Consider a concrete pinning cost. Say a single tool — a research run that takes 4 minutes — pins one client connection for the full duration in the sync world. With 200 concurrent agent sessions each running 6 such tools per session, that's 200 × 6 = 1,200 simultaneously pinned connections the harness must keep alive (illustrative). Under SEP-2663, each tool consumes one round-trip to receive a handle (~200ms), then nothing — the connection returns to the pool. The same workload now needs roughly 1,200 × (200ms / 240,000ms) ≈ 1 pinned connection on average (illustrative) — a ~1000× reduction in held sockets at any moment, before counting any of the polling overhead.
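Recomputed explicitly, with the same illustrative numbers as the paragraph above:

```typescript
// Illustrative back-of-envelope numbers from the text, not a benchmark.
const sessions = 200;
const toolsPerSession = 6;
const toolDurationMs = 4 * 60 * 1000;   // 240,000 ms per research run
const handleRoundTripMs = 200;          // one round-trip to receive the handle

const pinnedSync = sessions * toolsPerSession;                          // 1,200 held sockets
const pinnedAsync = pinnedSync * (handleRoundTripMs / toolDurationMs);  // ≈ 1 on average

console.log({ pinnedSync, pinnedAsync, reduction: pinnedSync / pinnedAsync }); // ≈ 1200×
```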
Where the design earns its keep
Multiply that across hundreds of concurrent agent sessions — each running multiple long-duration tools — and the connection count alone becomes a capacity problem before the actual work does. That is the argument the Agent Engineering Cost & Latency module makes for parallel tool fan-outs: the harness stalls not because compute is exhausted but because held sockets run out first. Task handles cut that link: one tool, one handle, zero pinned connections.
The shape of what SEP-2663 actually adds is concrete. The table below contrasts the legacy and new flows.
| Aspect | Sync tools/call | Tasks extension (SEP-2663) |
|---|---|---|
| Response shape | Final result | Polymorphic: immediate result OR a Task handle |
| Client blocked time | Full tool duration | One round-trip to receive the handle |
| Resumable across reconnect | No protocol-level resume path | Yes — the same taskId can be polled again |
| Cancellation | Close the connection, hope for the best | tasks/cancel (eventually consistent) |
| Mid-run input from server | Not supported | tasks/update responds to server-raised requests |
| Server-pushed events | N/A | Client-driven polling by default; optional subscriptions/listen for opt-in push |
There is a related idea worth knowing. The AsyncFC paper ships a structurally similar pattern one layer up: at the model side, the decoder treats a not-yet-resolved tool call as a typed placeholder and keeps generating tokens. SEP-2663 is what that placeholder is a placeholder for — the server-side handle the harness holds while the model decodes past it. Both pieces want the same thing: stop letting tool latency dictate the cost-and-latency profile of an agent.
The boundary of what SEP-2663 changes is not "how tools work." Existing fast tools keep returning their immediate result and the legacy path is untouched. The boundary is "what happens when tools take a while." For those — and for any tool that wants to support cancellation, mid-flight clarification, or survival across client reconnects — the Task handle becomes the right shape.
Goes deeper in: Agent Engineering → Harness Architecture → Checkpoints & Resumption