Many people said GPU compute would become a commodity. The opposite happened — and a new category of "neoclouds" is now racing to build the physical backbone of the AI boom. Stephen Balaban, co-founder and CTO of Lambda, explains why the conventional wisdom was exactly wrong, why we're still massively underbuilding compute, and what it actually takes to stand up a gigawatt-scale AI factory: land, power, cooling, networking, and a financing stack most people have never heard of. We go deep on the physics of how energy becomes tokens, NVIDIA's real moat, why a 2023 GPU can lease for more today than the day it shipped, and Stephen's provocative vision of "neural software." Plus the wild Lambda origin story — from a facial recognition startup to a camera in a baseball cap to a near-billion-dollar cloud business. This is the state of AI compute in 2026, from inside one of the companies building it.
(00:00) — Cold open
(01:21) — Why GPU compute was never a commodity
(02:45) — The H100 price index and what it gets wrong
(04:02) — The real moat: technology or financing?
(05:57) — Winner-take-all, or room for many neoclouds?
(06:48) — Are we overbuilding or underbuilding AI compute?
(09:26) — What if AI gets 10x more compute-efficient?
(10:44) — The real bottleneck: land, power, and shell
(11:38) — The backlash against data centers — and the misinformation
(15:00) — Opening the hood: from photons to tokens
(17:11) — Extracting more value from the same chip
(19:26) — Frontier inference and distributed training, explained
(23:26) — What actually drives compute cost
(25:21) — Lambda's chip stack and the NVIDIA relationship
(26:17) — A multi-silicon world? CUDA, cuDNN, and NVIDIA's real moat
(28:59) — Networking, storage, and the one-click cluster
(34:46) — Renting vs. owning, and full vertical integration
(36:24) — How global is Lambda? Does location still matter?
(38:44) — The financing stack: off-take agreements, SPVs, and credit
(41:16) — Why a 2023 GPU leases for more today
(42:36) — A futures market for compute?
(43:54) — Origin story: facial recognition, Perceptio, and Apple
(47:03) — The Lambda Hat and Dreamscope
(48:59) — The $60K bet that became a cloud business
(52:00) — Holding the team together through the hard times
(54:30) — Bringing on a new CEO; Stephen as CTO
(57:33) — Matching xAI on high-velocity deployment
(59:29) — "AI won't write software — it will become the software"
(01:01:30) — Neural software vs. vibe coding
(01:04:25) — Do agents change the compute layer?
(01:06:14) — Self-assembling software inside Lambda
(01:08:18) — Gigawatt-scale AI factories
(01:08:57) — One person, one GPU
(01:12:04) — Hot takes: overrated and underrated in AI