AI Explained

Plain explanations of trending AI concepts, with live visualizations.

Agent · 2026-06-22

S-Agent: spatial tool-use makes an 8B agent rival GPT-5.4 on spatial reasoning — Spatio-temporal evidence accumulation — What does it mean?

S-Agent turns a VLM into a planner that directs tools to build one shared 3-D model of a scene; an 8B agent then rivals GPT-5.4.

Agent · 2026-06-22

Multi-LCB extends LiveCodeBench to 12 languages — Cross-language generalization gap — What does it mean?

Multi-LCB ports each LiveCodeBench task into 12 languages with one judge, so a score drop measures pure cross-language generalization.

LLM · 2026-06-22

GLM-5.2 becomes the top open-weights model — Active vs total parameters — What does it mean?

GLM-5.2 lists 744B total but 40B active parameters: two numbers that measure different costs, the memory you hold versus the compute you pay per token.
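
As a rough illustration of why the two numbers matter, the sketch below turns them into a memory figure and a per-token compute figure. The parameter counts come from the summary above; the 2 bytes per weight (bf16 storage) and the 2 FLOPs per active parameter per token are common rules of thumb assumed here, not figures from GLM-5.2's documentation.

```python
# Rule-of-thumb cost sketch for a sparse (mixture-of-experts style) model.
# Assumptions, not from the source: bf16 weights (2 bytes each) and about
# 2 FLOPs (one multiply, one add) per active parameter per token.

TOTAL_PARAMS = 744e9          # total parameters, from the summary
ACTIVE_PARAMS = 40e9          # parameters used per token, from the summary
BYTES_PER_PARAM = 2           # assumed bf16 storage
FLOPS_PER_ACTIVE_PARAM = 2    # assumed forward-pass rule of thumb


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes (1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


def forward_gflops_per_token(active_params: float, flops_per_param: float) -> float:
    """Approximate forward-pass compute per generated token, in GFLOPs."""
    return active_params * flops_per_param / 1e9


memory_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
gflops = forward_gflops_per_token(ACTIVE_PARAMS, FLOPS_PER_ACTIVE_PARAM)
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"weights held in memory: {memory_gb:.0f} GB")
print(f"compute per token:      {gflops:.0f} GFLOPs")
print(f"share of weights used per token: {active_fraction:.1%}")
```

Under these assumptions, memory scales with the total count while per-token compute scales with the active count, which is why the two are quoted separately.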

Agent · 2026-06-22

GateMem shows agent memory can't balance utility, access control, and forgetting — Memory governance trilemma — What does it mean?

GateMem benchmarks shared agent memory on utility, access control, and reliable forgetting at once — and finds no method passes all three.

Agent · 2026-06-22

ContextRL rewards evidence selection to boost agent and multimodal reasoning — Contrastive context-selection RL — What does it mean?

ContextRL rewards a model for picking which of two near-identical contexts supports the answer — sharpening fine-grained evidence grounding.

GPU · 2026-06-21

UFP4 fixes FP4 pretraining's shrinkage bias — E2M1 shrinkage bias — What does it mean?

E2M1's lopsided 4-bit bins round values toward zero — a shrinkage bias UFP4 fixes with a Hadamard transform + stochastic rounding.
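
To make the rounding half of that summary concrete, here is a minimal sketch of stochastic rounding onto the E2M1 value grid. It covers only the rounding step: the Hadamard transform and any block scaling are left out, and the function names (`neighbors`, `stochastic_round`) are illustrative rather than taken from UFP4.

```python
import random

# Magnitudes representable in E2M1 (1 sign bit, 2 exponent bits, 1 mantissa bit).
# Note the uneven spacing: steps of 0.5, then 1, then 2.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def neighbors(x: float) -> tuple[float, float]:
    """Grid points just below and above |x|, with |x| clamped to the grid maximum."""
    mag = min(abs(x), E2M1_GRID[-1])
    lo = max(g for g in E2M1_GRID if g <= mag)
    hi = min(g for g in E2M1_GRID if g >= mag)
    return lo, hi


def stochastic_round(x: float, rng: random.Random) -> float:
    """Round x to a neighboring grid point, picking the upper one with
    probability proportional to how close x is to it. For in-range x the
    expected result equals x."""
    lo, hi = neighbors(x)
    mag = min(abs(x), E2M1_GRID[-1])
    if hi == lo:
        rounded = lo  # already on the grid (or clamped to the maximum)
    else:
        p_up = (mag - lo) / (hi - lo)
        rounded = hi if rng.random() < p_up else lo
    return rounded if x >= 0 else -rounded


rng = random.Random(0)
samples = [stochastic_round(2.4, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
print(f"mean of rounded samples for x = 2.4: {mean:.3f}")
```

Each individual sample lands on 2.0 or 3.0, but the average over many samples stays close to 2.4, which is the property that keeps the rounding error from pulling values in one direction.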

LLM · 2026-06-21

Taylor-Calibrate cuts hybrid-attention distillation tokens 4.9–9.2× — Taylor-guided gate initialization — What does it mean?

Taylor-Calibrate presets a linear-attention student's gates from the teacher — hitting distillation targets with 4.9–9.2× fewer tokens.

Agent · 2026-06-21

FAPO auto-optimizes multi-step LLM pipelines, beating GEPA on 15 of 18 benchmarks — Failure-attribution-gated prompt optimization — What does it mean?

FAPO has Claude Code diagnose where an LLM pipeline fails, then make scoped prompt or chain edits — beating GEPA on 15 of 18 benchmarks.

Agent · 2026-06-21

AtomMem gives LLM agents memory built from atomic facts, SOTA on LoCoMo — Atomic-fact agent memory — What does it mean?

AtomMem distills an agent's long history into atomic facts, files them by event and time, and links them in an associative graph for retrieval.

Agent · 2026-06-20

LedgerAgent gives tool-calling agents a structured state ledger — Pre-tool-call policy validation — What does it mean?

LedgerAgent tracks an agent's task state in a separate ledger and checks domain policy against it before any irreversible tool call.
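
The pattern in that summary can be sketched roughly as follows. Everything here is hypothetical and chosen for illustration: the `Ledger` class, the `issue_refund` tool, the refund policy, and the `call_tool` wrapper are not LedgerAgent's actual interface, only one plausible shape for a ledger-backed check that runs before an irreversible call.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical task-state ledger kept separately from the model's context."""
    facts: dict = field(default_factory=dict)


# Hypothetical set of tools whose effects cannot be undone.
IRREVERSIBLE_TOOLS = {"issue_refund"}


def refund_policy(ledger: Ledger, args: dict) -> list[str]:
    """Hypothetical domain policy: return violations (an empty list means allowed)."""
    violations = []
    if not ledger.facts.get("identity_verified", False):
        violations.append("identity not verified")
    if args.get("amount", 0) > ledger.facts.get("order_total", 0):
        violations.append("refund exceeds order total")
    return violations


POLICIES = {"issue_refund": refund_policy}


def call_tool(name: str, args: dict, ledger: Ledger, tools: dict) -> dict:
    """Check policy against the ledger before any irreversible tool runs."""
    if name in IRREVERSIBLE_TOOLS:
        violations = POLICIES[name](ledger, args)
        if violations:
            # The tool is never executed; the agent gets the reasons back.
            return {"status": "blocked", "violations": violations}
    result = tools[name](args)
    ledger.facts[f"last_{name}"] = result  # record the outcome in the ledger
    return {"status": "ok", "result": result}
```

A call such as `call_tool("issue_refund", {"amount": 80}, Ledger({"order_total": 50}), tools)` would be blocked in this sketch, because the check reads the ledger rather than trusting whatever the model currently believes about the task.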

LLM · 2026-06-20

HydraHead fuses full and linear attention per head, not per layer — Head-axis attention hybridization — What does it mean?

HydraHead mixes full and linear attention head-by-head, keeping exact attention only for retrieval-critical heads — a 7:1 split matching a coarser 3:1.

LLM · 2026-06-20

EfficientRollout — Self-speculative decoding with quantized self-drafters — What does it mean?

EfficientRollout speeds RL rollouts by drafting with a quantized copy of the model itself — a self-drafter that tracks the evolving policy for free.