Jensen Huang on Why AI Companies Win by Co-Designing the Whole AI Factory
Jensen Huang reasons from first principles that AI has turned the computer from a retrieval warehouse into a token-generating factory, so NVIDIA now co-designs the entire AI factory instead of just the GPU, riding four compounding scaling laws toward a future he treats as already inevitable.
Extreme Co-Design: The GPU Is No Longer the Bottleneck
Speeding up only the GPU barely helps: Amdahl's Law means a distributed AI job stays capped by everything you did not accelerate, so NVIDIA co-designs the entire stack — CPU, networking, storage, power, cooling — at once.
if computation represents 50% of the problem, and I sped up computation infinitely like a million times, you know, I only sped up the total workload by a factor of two.
The $1.5B Bet: Install Base Defines an Architecture
NVIDIA bet the company on CUDA — shipping it on every GeForce raised costs about 50% and cratered the market cap from roughly $8B to $1.5B, on the conviction that an architecture is worth nothing without an install base.
After we launched CUDA, I recognized that it was going to add so much cost, but it was something we believed in. You know, our market cap went down to like one and a half billion dollars.
Manifesting the Future by Shaping Belief
Jensen manifests the future by shaping belief long before he announces anything — dropping the same idea across board, employees, and partners for years, so the reveal lands as 'what took you so long?' instead of 'this is insane.'
You manifest a future and that future is so convincing, there's no way it won't happen. There's a lot of suffering in between, but you've gotta believe what you believe.
Speed of Light, Not Continuous Improvement
Design from the speed of light, not yesterday's number — rather than trimming a 74-day process to 72, Jensen strips it to zero, asks what physics allows (maybe six days), then reasons back so every remaining day is a justified trade-off.
And then now that you know that six days is possible, then the conversation from 74 to six, surprisingly much more effective.
Four Scaling Laws — and Inference Is Thinking
Inference is thinking, and thinking is expensive — Jensen's four scaling laws (pre-training, post-training, test-time, and agentic) all compound on one axis, compute, which is why the 'inference will be cheap and commoditized' prediction was always wrong.
inference is thinking, and I think thinking is hard. Thinking is way harder than reading.
The Computer Became a Factory
The computer stopped being a warehouse and became a factory — where retrieval systems fetched pre-recorded files, generative AI now manufactures tokens in real time, turning compute from a storage cost into a revenue-generating production line.
computers, because it was a storage system, it was largely a warehouse. We're now building factories. Warehouses don't make much money. Factories directly correlates with the company's revenues.
Your Job Is a Purpose, Not a Task
Your job is a purpose, not a task — AI hit superhuman radiology around 2020 yet the number of radiologists grew into a shortage, because automating the task (reading scans) expanded the purpose (diagnosing disease) rather than erasing it.
the purpose of your job and the tasks and tools that you use to do your job are related, not the same.
Intelligence Is a Commodity; Humanity Is Not
Intelligence is becoming a commodity while humanity is the scarce resource — Jensen, who calls himself lower on the intelligence curve than his 60 experts, argues that character, compassion, and determination are the superhuman traits no chip will replicate.
I actually think intelligence is a commodity. I'm surrounded by intelligent people.