Understand AI systems by seeing them work

Interactive visual simulations from GPU architecture to running an agent fleet in production. No GPU required. Free.

Live Preview: Self-Attention
The cat sat on the mat

        The    cat    sat    on     the    mat
The     0.35   0.07   0.05   0.10   0.35   0.08
cat     0.06   0.26   0.47   0.06   0.07   0.08
sat     0.05   0.52   0.21   0.06   0.05   0.11
on      0.07   0.05   0.08   0.20   0.08   0.52
the     0.33   0.07   0.05   0.10   0.38   0.07
mat     0.04   0.07   0.10   0.42   0.05   0.32

Hover over tokens to explore attention patterns
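
For readers who want to see where weights like the ones in the preview come from, here is a minimal sketch of scaled dot-product attention weights in plain Python. It assumes random query and key vectors as stand-ins for learned projections, so the numbers it prints will not match the preview; the names `attention_weights`, `softmax`, and `dim` are illustrative, not taken from the site. What it does share with the preview is the shape of the result: one row per token, with each row summing to 1.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Row i holds how strongly token i attends to every token."""
    scale = 1.0 / math.sqrt(len(keys[0]))  # the 1/sqrt(d_k) scaling
    weights = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights.append(softmax(scores))
    return weights


tokens = ["The", "cat", "sat", "on", "the", "mat"]
dim = 4  # hypothetical head dimension, chosen only to keep the example small
rng = random.Random(0)

# Assumption: random vectors stand in for learned query/key projections.
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in tokens]
keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in tokens]

weights = attention_weights(queries, keys)
for token, row in zip(tokens, weights):
    print(f"{token:>4}", " ".join(f"{w:.2f}" for w in row))
```

No causal mask is applied here, so every token can attend to every other token, including later ones.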

The Full AI Stack

5 tracks that take you from GPU hardware to running an agent fleet in production. Start anywhere — each track stands on its own.

AI Explained

Plain-language takes on trending AI concepts, with live simulations.

GPU & CUDA

9 interactive modules, from the GPU execution model to Triton & torch.compile. All free.

LLM Internals

9 interactive modules, from tokenization to PagedAttention. All free.

LLM Serving

7 interactive modules, from inference-engine internals to prefix caching. All free.

AI Agents

9 interactive modules. Free, browser-based.

Agent Engineering

9 interactive modules on production-ready agent engineering, picking up where AI Agents Foundations ends. Free, browser-based.