AI Explained
Plain explanations of trending AI concepts, with live visualizations.
S-Agent: spatial tool-use makes an 8B agent rival GPT-5.4 on spatial reasoning — Spatio-temporal evidence accumulation — What does it mean?
S-Agent makes a VLM a planner directing tools to build one shared 3-D model of a scene — an 8B agent then rivals GPT-5.4.
Multi-LCB extends LiveCodeBench to 12 languages — Cross-language generalization gap — What does it mean?
Multi-LCB ports each LiveCodeBench task into 12 languages with one judge, so a score drop measures pure cross-language generalization.
GLM-5.2 becomes the top open-weights model — Active vs total parameters — What does it mean?
GLM-5.2 lists 744B total but 40B active parameters — two numbers that decode different costs: the memory you hold vs the compute you pay per token.
GateMem shows agent memory can't balance utility, access control, and forgetting — Memory governance trilemma — What does it mean?
GateMem benchmarks shared agent memory on utility, access control, and reliable forgetting at once — and finds no method passes all three.
ContextRL rewards evidence selection to boost agent and multimodal reasoning — Contrastive context-selection RL — What does it mean?
ContextRL rewards a model for picking which of two near-identical contexts supports the answer — sharpening fine-grained evidence grounding.
UFP4 fixes FP4 pretraining's shrinkage bias — E2M1 shrinkage bias — What does it mean?
E2M1's lopsided 4-bit bins round values toward zero — a shrinkage bias UFP4 fixes with a Hadamard transform + stochastic rounding.
Taylor-Calibrate cuts hybrid-attention distillation tokens 4.9–9.2× — Taylor-guided gate initialization — What does it mean?
Taylor-Calibrate presets a linear-attention student's gates from the teacher — hitting distillation targets with 4.9–9.2× fewer tokens.
FAPO auto-optimizes multi-step LLM pipelines, beating GEPA on 15 of 18 benchmarks — Failure-attribution-gated prompt optimization — What does it mean?
FAPO has Claude Code diagnose where an LLM pipeline fails, then make scoped prompt or chain edits — beating GEPA on 15 of 18 benchmarks.
AtomMem gives LLM agents memory built from atomic facts, SOTA on LoCoMo — Atomic-fact agent memory — What does it mean?
AtomMem distills an agent's long history into atomic facts, files them by event and time, and links them in an associative graph for retrieval.
LedgerAgent gives tool-calling agents a structured state ledger — Pre-tool-call policy validation — What does it mean?
LedgerAgent tracks an agent's task state in a separate ledger and checks domain policy against it before any irreversible tool call.
HydraHead fuses full and linear attention per head, not per layer — Head-axis attention hybridization — What does it mean?
HydraHead mixes full and linear attention head-by-head, keeping exact attention only for retrieval-critical heads — a 7:1 split matching a coarser 3:1.
EfficientRollout — Self-speculative decoding with quantized self-drafters — What does it mean?
EfficientRollout speeds RL rollouts by drafting with a quantized copy of the model itself — a self-drafter that tracks the evolving policy for free.











