AI Explained
Plain explanations of trending AI concepts, with live visualizations.
IBM Granite 4.1 — 8B dense matches the prior 32B MoE — What does it mean?
Granite 4.1 8B dense matches the prior 32B-A9B MoE on tool calling — and fits in roughly one quarter of the GPU memory.
GLM-5V-Turbo — native multimodal vs text-first vision-bolted designs — What does it mean?
GLM-5V-Turbo trains text + vision + tool data jointly from step 1, vs the LLaVA-style default that bolts a frozen ViT onto a text-only LLM.
DeepSeek V4-Pro and V4-Flash — long-context cost cut to a fraction — What does it mean?
V4-Pro and V4-Flash drop both per-token FLOPs and KV cache to roughly 7-27% of V3.2 at 1M context — same cluster, ~10-14× more concurrent users.
CoPD paper — Reinforcement Learning with Verifiable Rewards (RLVR) — What does it mean?
RLVR is post-training where a tiny verifier — unit tests, equality checks, proof assistant — replaces the learned reward model. Reward = 0 or 1, depending on whether the answer checks out.
CoPD paper — Co-evolving Policy Distillation between parallel experts — What does it mean?
CoPD trains N specialist LLMs in parallel as mutual teachers — every step distills on-policy from current peers, beating RLVR + frozen-expert OPD.