Latent Space: The AI Engineer Podcast

Available Episodes

5 of 147

Long Live Context Engineering - with Jeff Huber of Chroma
Jeff Huber of Chroma joins us to talk about what actually matters in vector databases in 2025, why “modern search for AI” is different, and how to ship systems that don’t rot as context grows.
Full show notes: https://www.latent.space/p/chroma
00:00 Introductions
00:48 Why Build Chroma
02:55 Information Retrieval vs. Search
04:29 Staying Focused in a Competitive AI Market
08:08 Building Chroma Cloud
12:15 Context Engineering and the Problems with RAG
16:11 Context Rot
21:49 Prioritizing Context Quality
27:02 Code Indexing and Retrieval Strategies
32:04 Chunk Rewriting and Query Optimization for Code
34:07 Transformer Architecture Evolution and Retrieval Systems
38:06 Memory as a Benefit of Context Engineering
40:13 Structuring AI Memory and Offline Compaction
45:46 Lessons from Previous Startups and Building with Purpose
47:32 Religion and Values in Silicon Valley
50:18 Company Culture, Design, and Brand Consistency
52:36 Hiring at Chroma: Designers, Researchers, and Engineers
--------
Greg Brockman on OpenAI's Road to AGI
Greg Brockman, co-founder and president of OpenAI, joins us to talk about GPT-5 and GPT-OSS, the future of software engineering, why reinforcement learning is still scaling, and how OpenAI is planning to get to AGI.
00:00 Introductions
01:04 The Evolution of Reasoning at OpenAI
04:01 Online vs Offline Learning in Language Models
06:44 Sample Efficiency and Human Curation in Reinforcement Learning
08:16 Scaling Compute and Supercritical Learning
13:21 Wall clock time limitations in RL and real-world interactions
16:34 Experience with ARC Institute and DNA neural networks
19:33 Defining the GPT-5 Era
22:46 Evaluating Model Intelligence and Task Difficulty
25:06 Practical Advice for Developers Using GPT-5
31:48 Model Specs
37:21 Challenges in RL Preferences (e.g., try/catch)
39:13 Model Routing and Hybrid Architectures in GPT-5
43:58 GPT-5 pricing and compute efficiency improvements
46:04 Self-Improving Coding Agents and Tool Usage
49:11 On-Device Models and Local vs Remote Agent Systems
51:34 Engineering at OpenAI and Leveraging LLMs
54:16 Structuring Codebases and Teams for AI Optimization
55:27 The Value of Engineers in the Age of AGI
58:42 Current state of AI research and lab diversity
01:01:11 OpenAI’s Prioritization and Focus Areas
01:03:05 Advice for Founders: It's Not Too Late
01:04:20 Future outlook and closing thoughts
01:04:33 Time Capsule to 2045: Future of Compute and Abundance
01:07:07 Time Capsule to 2005: More Problems Will Emerge
--------
The RLVR Revolution — with Nathan Lambert (AI2, Interconnects.ai)
Chapters
00:00:00 Welcome and Guest Introduction
00:01:18 Tulu, OVR, and the RLVR Journey
00:03:40 Industry Approaches to Post-Training and Preference Data
00:06:08 Understanding RLVR and Its Impact
00:06:18 Agents, Tool Use, and Training Environments
00:10:34 Open Data, Human Feedback, and Benchmarking
00:12:44 Chatbot Arena, Sycophancy, and Evaluation Platforms
00:15:42 RLHF vs RLVR: Books, Algorithms, and Future Directions
00:17:54 Frontier Models: Reasoning, Hybrid Models, and Data
00:22:11 Search, Retrieval, and Emerging Model Capabilities
00:29:23 Tool Use, Curriculum, and Model Training Challenges
00:38:06 Skills, Planning, and Abstraction in Agent Models
00:46:50 Parallelism, Verifiers, and Scaling Approaches
00:54:33 Overoptimization and Reward Design in RL
01:02:27 Open Models, Personalization, and the Model Spec
01:06:50 Open Model Ecosystem and Infrastructure
01:13:05 Meta, Hardware, and the Future of AI Competition
01:15:42 Building an Open DeepSeek and Closing Thoughts
We first had Nathan on to give us his RLHF deep dive when he was joining AI2, and now he’s back to help us catch up on the evolution to RLVR (Reinforcement Learning with Verifiable Rewards), first proposed in his Tulu 3 paper. While RLHF remains foundational, RLVR has emerged as a powerful approach for training models on tasks with clear success criteria and using verifiable, objective functions as reward signals, particularly useful in domains like math, code correctness, and instruction-following. Instead of relying solely on subjective human feedback, RLVR leverages deterministic signals to guide optimization, making it more scalable and potentially more reliable across many domains. However, he notes that RLVR is still rapidly evolving, especially regarding how it handles tool use and multi-step reasoning. We also discussed the Tulu model series, a family of instruction-tuned open models developed at AI2.
Tulu is designed to be a reproducible, state-of-the-art post-training recipe for the open community. Unlike frontier labs like OpenAI or Anthropic, which rely on vast and often proprietary datasets, Tulu aims to distill and democratize best practices for instruction and preference tuning. We are impressed with how small eval suites, careful task selection, and transparent methodology can rival even the best proprietary models on specific benchmarks.
One of the most fascinating threads is the challenge of incorporating tool use into RL frameworks. Lambert highlights that while you can prompt a model to use tools like search or code execution, getting the model to reliably learn when and how to use them through RL is much harder. This is compounded by the difficulty of designing reward functions that avoid overoptimization, where models learn to “game” the reward signal rather than solve the underlying task. This is particularly problematic in code generation, where models might reward hack unit tests by inserting pass statements instead of correct logic. As models become more agentic and are expected to plan, retrieve, and act across multiple tools, reward design becomes a critical bottleneck.
Other topics covered:
- The evolution from RLHF (Reinforcement Learning from Human Feedback) to RLVR (Reinforcement Learning with Verifiable Rewards)
- The goals and technical architecture of the Tulu models, including the motivation to open-source post-training recipes
- Challenges of tool use in RL: verifiability, reward design, and scaling across domains
- Evaluation frameworks and the role of platforms like Chatbot Arena and emerging “arena”-style benchmarks
- The strategic tension between hybrid reasoning models and unified reasoning models at the frontier
- Planning, abstraction, and calibration in reasoning agents and why these concepts matter
- The future of open-source AI models, including DeepSeek, OLMo, and the potential for an “American DeepSeek”
- The importance of model personality, character tuning, and the model spec paradigm
- Overoptimization in RL settings and how it manifests in different domains (control tasks, code, math)
- Industry trends in inference-time scaling and model parallelism
Finally, the episode closes with a vision for the future of open-source AI. Nathan has now written up his ambition to build an “American DeepSeek”: a fully open, end-to-end reasoning-capable model with transparent training data, tools, and infrastructure. He emphasizes that open-source AI is not just about weights; it’s about releasing recipes, evaluations, and methods that lower the barrier for everyone to build and understand cutting-edge systems. It would seem the
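To make the idea of a verifiable reward concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the Tulu 3 recipe or the episode: the function names `extract_final_answer` and `verifiable_reward`, and the choice to treat the last number in a completion as the model's answer, are assumptions made for this example.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the last number that appears in the completion, or None.

    Assumption for this sketch: the final number is the model's answer.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Deterministic reward: 1.0 if the extracted answer equals the reference, else 0.0.

    No human preference label is involved; the check is purely programmatic.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if float(answer) == float(reference) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("3 + 4 = 7, so the answer is 7", "7"))
    print(verifiable_reward("I believe the answer is 8", "7"))
```

A check this simple also hints at the overoptimization problem described above: a model only needs to end its output with the right number to earn the reward, whether or not the reasoning before it is sound.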
--------
🕰️ The Oral History of Windsurf (ft. Varun Mohan, Scott Wu, Jeff Wang, Kevin Hou, Anshul R)
This is a recap episode that ends with a short, fresh interview with Jeff Wang and Scott Wu on the future of Windsurf + Cognition. As the story of Windsurf as an independent company has come to a dramatic close with Google and Cognition, we’re taking this opportunity to look back at our coverage of Windsurf over the last 3 years. Here’s a brief timeline with related links.
Jun 2021 - Exafunction founded
Oct 2022 - Codeium pivot https://windsurf.com/blog/beta-launch-announcement
Dec 2022 - “Copilot for X” https://www.latent.space/p/what-building-copilot-for-x-really
Mar 2023 - Codeium first episode, LS episode 2 https://www.latent.space/p/varun-mohan
July 2023 - “How to Make AI UX Your Moat” https://www.latent.space/p/ai-ux-moat
Mar 2024 - Cognition Devin launch https://www.youtube.com/watch?v=fjHtjT7GO1c
Jun 2024 - Scott @ AI Engineer https://www.youtube.com/watch?v=T7NWjoD_OuY
Jun 2024 - Kevin @ AI Engineer https://www.youtube.com/watch?v=DuZXbinJ4Uc
Nov 2024 - “Enterprise Infra Native” https://www.latent.space/p/enterprise
Nov 2024 - Windsurf launch, LS Episode https://www.latent.space/p/windsurf
Mar 2025 - Kevin Hou @ AI Engineer https://www.youtube.com/watch?v=bVNNvWq6dKo
Jun 2025 - Scott @ AI Engineer https://www.youtube.com/watch?v=MI83buT_23o
Jun 2025 - Kevin Hou @ AI Engineer https://www.youtube.com/watch?v=JVuNPL5QO8Q
Jul 2025 - Jeff + Scott, CogSurf Episode ← new one, released here.
We hope this serves as food for thought for students of history, and a reintroduction to the Latent Space extended universe and backlog, for those of you who are new. Welcome!
Timestamps
[00:02:07] Mar 2024 Codeium @ LS
[00:52:36] Mar 2024 Devin Launch Video
[00:54:28] Jun 2024 Codeium @ AIE SF
[01:12:14] Jun 2024 Cognition @ AIE SF
[01:30:53] Nov 2024 Windsurf Launch Video
[01:37:16] Nov 2024 Windsurf Launch @ LS
[02:43:10] Feb 2025 Windsurf @ AIE NYC
[03:03:27] Jun 2025 Cognition @ AIE SF
[03:18:50] June 2025 Windsurf @ AIE SF
[03:34:23] July 2025 - Cognition + Windsurf
Chapters
00:00:00 Mar 2024 Codeium @ LS
00:52:36 Mar 2024 Devin Launch Video
00:54:28 Jun 2024 Codeium @ AIE SF
01:12:14 Jun 2024 Cognition @ AIE SF
01:30:53 Nov 2024 Windsurf Launch Video
01:37:16 Nov 2024 Windsurf Launch @ LS
02:43:10 Feb 2025 Windsurf @ AIE NYC
03:03:27 Jun 2025 Cognition @ AIE SF
03:18:50 June 2025 Windsurf @ AIE SF
03:34:23 July 2025 - Cognition + Windsurf
--------
AI is Eating Search
ChatGPT handles 2.5B prompts/day and is on track to match Google's daily searches by end of 2026. AI agents don't browse like us: they crave queryable, chunkable data for tools like ChatGPT & Perplexity. A new industry is being born. Some are calling it AI SEO, others GEO, but what is clear is that it drives strong results: businesses are seeing 2-4x higher conversion from visitors coming from AI compared to traditional search. Robert McCloy is the co-founder of Scrunch AI (https://scrunchai.com/), a fast-growing company that helps brands and businesses rewrite their content on the fly based on what agents are looking for.
--------
56:21

About Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space

Business Technology Entrepreneurship
