AI Explained

Plain explanations of trending AI concepts, with live visualizations.

Agent

MCP SEP-2468 — RFC 9207 iss parameter for OAuth mix-up defense — What does it mean?

SEP-2468 recommends that MCP authorization servers include an iss parameter in authorization responses, and requires clients to validate it by exact string comparison against the recorded issuer, blocking OAuth mix-up attacks in multi-IdP setups.
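The client-side check can be sketched in a few lines. This is a minimal illustration, not the SEP's reference implementation: the function name `validate_iss` and the example URLs are hypothetical, and rejecting a missing iss parameter is a strict policy assumed here for simplicity.

```python
from urllib.parse import urlparse, parse_qs


def validate_iss(redirect_url: str, recorded_issuer: str) -> bool:
    """Return True only if the response's iss exactly matches the recorded issuer."""
    params = parse_qs(urlparse(redirect_url).query)
    values = params.get("iss", [])
    # Assumed strict policy: a missing or repeated iss parameter is rejected.
    if len(values) != 1:
        return False
    # Exact string comparison: no normalization, no trailing-slash tolerance.
    return values[0] == recorded_issuer
```

A response carrying `iss=https%3A%2F%2Fas.example` passes against a recorded issuer of `https://as.example`, while the same value with a trailing slash fails, which is the point of comparing strings rather than normalized URLs.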

Agent

RecMem paper — Subconscious + recurrence-triggered agent memory — What does it mean?

RecMem encodes every agent interaction into a cheap embedding-only 'subconscious' vector store and only invokes the LLM to consolidate clusters whose density crosses a recurrence threshold — reportedly up to 87% fewer memory-construction tokens.
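The recurrence trigger can be illustrated with a toy store. This sketch assumes that "density" means the number of stored vectors within a cosine-similarity threshold of the new one; the class name, both thresholds, and that definition are illustrative assumptions, not details from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class SubconsciousStore:
    """Hypothetical embedding-only store with a recurrence trigger."""

    def __init__(self, sim_threshold=0.9, recurrence_threshold=3):
        self.sim_threshold = sim_threshold
        self.recurrence_threshold = recurrence_threshold
        self.vectors = []

    def add(self, vec):
        """Store vec; return True when its neighborhood is dense enough
        that an LLM consolidation pass would be worth invoking."""
        self.vectors.append(vec)
        # The count includes vec itself, since it was just appended.
        neighbors = sum(
            1 for v in self.vectors if cosine(v, vec) >= self.sim_threshold
        )
        return neighbors >= self.recurrence_threshold
```

Only the calls that return True would reach the LLM; every other interaction costs one embedding and a similarity scan.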

Agent

MSR delegation study — Cascading fidelity loss over 20 iterations — What does it mean?

Microsoft Research's delegation stress test runs 20 rounds of LLM-to-LLM document editing with constrained in-loop verification — strong frontier models lose 19–34% artifact fidelity by iteration 20.

Agent

MCP SEP-2577 — Three deprecations and a one-year migration window — What does it mean?

SEP-2577 deprecates three early MCP features — Roots, Sampling, Logging — and introduces a Deprecated lifecycle that keeps them functional for one year, then Removed.

Agent

Tool router paper — Contextual-bandit tool routing — What does it mean?

The paper reframes the choice between equivalent tool providers (two search APIs, two code executors, …) as a contextual bandit problem: the router learns answer quality per service cycle, not just lowest latency.
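To make the bandit framing concrete, here is a minimal epsilon-greedy router over discrete contexts. It is a deliberate simplification and not the paper's algorithm: the class name, the use of a plain context label, and the scalar reward standing in for answer quality are all assumptions.

```python
import random
from collections import defaultdict


class ToolRouter:
    """Hypothetical epsilon-greedy router: one running mean per (context, provider)."""

    def __init__(self, providers, epsilon=0.1, rng=None):
        self.providers = list(providers)
        self.epsilon = epsilon
        self.rng = rng or random.Random(0)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.providers)
        return max(self.providers, key=lambda p: self.means[(context, p)])

    def update(self, context, provider, reward):
        # Incremental mean of observed answer quality for this pair.
        key = (context, provider)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]
```

After a few updates the router can prefer different providers for different contexts, which a latency-only rule cannot do.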

Agent

FutureSim benchmark — Harness-level agent eval vs single-shot QA — What does it mean?

FutureSim replays 3 months of real-world news article by article and asks agents to forecast events. The best frontier agent reaches only 25%, and many earn a Brier skill score worse than not predicting at all.
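For readers unfamiliar with the metric, the Brier skill score compares a forecaster's Brier score with that of a reference forecast: BSS = 1 - BS / BS_ref. The sketch below assumes binary outcomes and leaves the reference forecast to the caller; which baseline FutureSim uses is not stated here, so the constant 0.5 forecast in the usage note is only an illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, reference_probs):
    """1 - BS / BS_ref: positive beats the reference, negative is worse.

    Assumes the reference score is nonzero.
    """
    bs = brier_score(probs, outcomes)
    bs_ref = brier_score(reference_probs, outcomes)
    return 1.0 - bs / bs_ref
```

With outcomes `[1, 0, 1, 0]` and a reference of 0.5 everywhere, confident forecasts that are consistently wrong, such as `[0.1, 0.9, 0.1, 0.9]`, yield a negative skill score, which is how an agent can end up worse than an uninformative baseline.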

Agent

CDD paper — Context-Driven Decomposition for RAG knowledge conflict — What does it mean?

When retrieved context disagrees with the model's parametric knowledge, standard RAG hits ~15% under misconception injection — CDD extracts each claim, runs an explicit conflict-resolution sub-prompt, and reaches 71.3% on temporal-shift cases.

Agent

MCP SEP-2663 lands Tasks extension — async task handles for long-running tool calls — What does it mean?

SEP-2663 lets an MCP server return a Task handle from tools/call; the client then drives it with tasks/get, tasks/update, and tasks/cancel — no blocked connections.
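The client side of this flow is essentially a polling loop. The sketch below uses only the method names given above; the `send` callable, the `taskId` parameter, and the `status` field with a `"working"` value are hypothetical placeholders rather than the SEP's actual wire format.

```python
import time


def wait_for_task(send, task_id, poll_interval=0.0, max_polls=10):
    """Poll tasks/get until the task leaves the (assumed) 'working' state.

    `send(method, params)` is a hypothetical callable that performs one
    request and returns the result as a dict.
    """
    for _ in range(max_polls):
        result = send("tasks/get", {"taskId": task_id})
        if result.get("status") != "working":
            return result
        time.sleep(poll_interval)
    # Give up: ask the server to cancel rather than leave the task running.
    send("tasks/cancel", {"taskId": task_id})
    raise TimeoutError(f"task {task_id} did not finish in {max_polls} polls")
```

Because each poll is a short request, no connection has to stay open for the lifetime of the tool call.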

Agent

Is Grep All You Need? — Grep vs vector retrieval for agentic search — What does it mean?

An empirical study on 116 LongMemEval questions finds literal grep generally beats vector retrieval inside agent harnesses — and that harness design and tool-calling style dominate the retrieval algorithm choice.

Agent

AsyncFC paper — Symbolic futures in the decode stream — What does it mean?

AsyncFC inserts a typed placeholder when the model emits a tool call, so decoding keeps flowing while the tool runs — no retraining required.
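The placeholder idea can be mimicked outside a model with ordinary futures. In this toy version the "decode stream" is a list of tokens, a tool call is a `("call", name, arg)` tuple, and the placeholder is an untyped string; all of these representations are assumptions made for illustration, not AsyncFC's actual mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_with_futures(tokens, tools):
    """Emit a placeholder for each tool call, keep consuming tokens,
    and substitute the tool results once the stream has ended."""
    stream, pending = [], {}
    with ThreadPoolExecutor() as pool:
        for tok in tokens:
            if isinstance(tok, tuple):
                _, name, arg = tok
                placeholder = f"<future:{len(pending)}>"
                # The tool starts running in the background immediately.
                pending[placeholder] = pool.submit(tools[name], arg)
                stream.append(placeholder)
            else:
                stream.append(tok)
        # Resolve placeholders only after all tokens have been consumed.
        return [pending[t].result() if t in pending else t for t in stream]
```

For example, `decode_with_futures(["a", ("call", "upper", "b"), "c"], {"upper": str.upper})` keeps processing `"c"` while the tool runs and then fills in the result where the placeholder sat.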