AI Explained

Plain explanations of trending AI concepts, with live visualizations.

Agent

S-Agent: spatial tool-use makes an 8B agent rival GPT-5.4 on spatial reasoning — Spatio-temporal evidence accumulation — What does it mean?

S-Agent turns a VLM into a planner that directs tools to build one shared 3-D model of a scene, letting an 8B agent rival GPT-5.4 on spatial reasoning.

Agent

Multi-LCB extends LiveCodeBench to 12 languages — Cross-language generalization gap — What does it mean?

Multi-LCB ports each LiveCodeBench task into 12 languages with one judge, so a score drop measures pure cross-language generalization.

LLM

GLM-5.2 becomes the top open-weights model — Active vs total parameters — What does it mean?

GLM-5.2 lists 744B total but 40B active parameters: two numbers that measure different costs, the memory you hold vs the compute you pay per token.
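
To make the two costs concrete, here is a rough back-of-the-envelope sketch. The 744B and 40B figures come from the summary above; the 2 bytes per parameter (16-bit weights) and the "about 2 FLOPs per active parameter per token" rule of thumb are assumptions for illustration, not figures published for GLM-5.2.

```python
# Back-of-the-envelope sketch: total params drive memory, active params drive compute.
# Assumptions (not from the model card): 16-bit weights, ~2 forward FLOPs per active param.

TOTAL_PARAMS = 744e9   # every weight that must be stored
ACTIVE_PARAMS = 40e9   # weights actually used for any one token


def weight_memory_gb(total_params: float, bytes_per_param: float = 2.0) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


def forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: one multiply and one add per active weight."""
    return 2.0 * active_params


if __name__ == "__main__":
    print(f"weights held in memory: {weight_memory_gb(TOTAL_PARAMS):.0f} GB")
    print(f"compute per token:      {forward_flops_per_token(ACTIVE_PARAMS):.2e} FLOPs")
    print(f"active share of total:  {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

Under these assumptions the memory bill scales with the 744B figure while the per-token compute bill scales with the 40B figure, which is why the two numbers are reported separately.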

Agent

GateMem shows agent memory can't balance utility, access control, and forgetting — Memory governance trilemma — What does it mean?

GateMem benchmarks shared agent memory on utility, access control, and reliable forgetting at once — and finds no method passes all three.

Agent

ContextRL rewards evidence selection to boost agent and multimodal reasoning — Contrastive context-selection RL — What does it mean?

ContextRL rewards a model for picking which of two near-identical contexts supports the answer — sharpening fine-grained evidence grounding.

GPU

UFP4 fixes FP4 pretraining's shrinkage bias — E2M1 shrinkage bias — What does it mean?

E2M1's lopsided 4-bit bins round values toward zero — a shrinkage bias UFP4 fixes with a Hadamard transform + stochastic rounding.
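
As a small illustration of the rounding half of that fix, the sketch below contrasts round-to-nearest with stochastic rounding on the standard E2M1 magnitude grid (0, 0.5, 1, 1.5, 2, 3, 4, 6). It is a toy, not UFP4's implementation: the function names are hypothetical, ties in nearest rounding go to the smaller magnitude for simplicity, per-block scaling is ignored, and the Hadamard transform is left out entirely.

```python
import bisect
import math
import random

# Representable magnitudes of a 4-bit E2M1 float (sign handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def neighbors(mag: float) -> tuple[float, float]:
    """Return the grid points just below and above a magnitude (clipped to 6.0)."""
    mag = min(mag, E2M1_GRID[-1])
    hi_idx = bisect.bisect_left(E2M1_GRID, mag)
    if E2M1_GRID[hi_idx] == mag:      # already exactly representable
        return mag, mag
    return E2M1_GRID[hi_idx - 1], E2M1_GRID[hi_idx]


def round_nearest(x: float) -> float:
    """Deterministic rounding; ties go to the smaller magnitude (a simplification)."""
    lo, hi = neighbors(abs(x))
    mag = min(abs(x), E2M1_GRID[-1])
    return math.copysign(lo if mag - lo <= hi - mag else hi, x)


def stochastic_round(x: float, rng: random.Random) -> float:
    """Round up with probability equal to the fractional position between neighbors,
    so the expected result equals the (clipped) input."""
    lo, hi = neighbors(abs(x))
    if lo == hi:
        return math.copysign(lo, x)
    mag = min(abs(x), E2M1_GRID[-1])
    p_up = (mag - lo) / (hi - lo)
    return math.copysign(hi if rng.random() < p_up else lo, x)


if __name__ == "__main__":
    rng = random.Random(0)
    x = 2.4
    samples = [stochastic_round(x, rng) for _ in range(20_000)]
    print("nearest:", round_nearest(x))
    print("stochastic mean:", sum(samples) / len(samples))
```

Round-to-nearest always maps 2.4 to the same grid point, so its error never averages out; stochastic rounding lands on 2.0 or 3.0 at random, with probabilities chosen so that the average over many draws stays close to 2.4.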

LLM

Taylor-Calibrate cuts hybrid-attention distillation tokens 4.9–9.2× — Taylor-guided gate initialization — What does it mean?

Taylor-Calibrate presets a linear-attention student's gates from the teacher — hitting distillation targets with 4.9–9.2× fewer tokens.

Agent

FAPO auto-optimizes multi-step LLM pipelines, beating GEPA on 15 of 18 benchmarks — Failure-attribution-gated prompt optimization — What does it mean?

FAPO has Claude Code diagnose where an LLM pipeline fails, then make scoped prompt or chain edits — beating GEPA on 15 of 18 benchmarks.

Agent

AtomMem gives LLM agents memory built from atomic facts, SOTA on LoCoMo — Atomic-fact agent memory — What does it mean?

AtomMem distills an agent's long history into atomic facts, files them by event and time, and links them in an associative graph for retrieval.

Agent

LedgerAgent gives tool-calling agents a structured state ledger — Pre-tool-call policy validation — What does it mean?

LedgerAgent tracks an agent's task state in a separate ledger and checks domain policy against it before any irreversible tool call.

LLM

HydraHead fuses full and linear attention per head, not per layer — Head-axis attention hybridization — What does it mean?

HydraHead mixes full and linear attention head-by-head, keeping exact attention only for retrieval-critical heads — a 7:1 split matching a coarser 3:1.

LLM

EfficientRollout — Self-speculative decoding with quantized self-drafters — What does it mean?

EfficientRollout speeds RL rollouts by drafting with a quantized copy of the model itself — a self-drafter that tracks the evolving policy for free.