Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where "everything is vision." Learn about the differences between video and image understanding, token representations, higher-FPS video sampling, and more.

Chapters:
0:00 - Intro
1:12 - Why Gemini is natively multimodal
2:23 - The technology behind multimodal models
5:15 - Video understanding with Gemini 2.5
9:25 - Deciding what to build next
13:23 - Building new product experiences with multimodal AI
17:15 - The vision for proactive assistants
24:13 - Improving video usability with variable FPS and frame tokenization
27:35 - What’s next for Gemini’s multimodal development
31:47 - Deep dive on Gemini’s document understanding capabilities
37:56 - The teamwork and collaboration behind Gemini
40:56 - What’s next with model behavior

Watch on YouTube: https://www.youtube.com/watch?v=K4vXvaRV0dw
--------
44:17
--------
Building Gemini's Coding Capabilities
Connie Fan, Product Lead for Gemini's coding capabilities, and Danny Tarlow, Research Lead for Gemini's coding capabilities, join host Logan Kilpatrick for an in-depth discussion on how the team built one of the world's leading AI coding models. Learn more about the early goals that shaped Gemini's approach to code, the rise of 'vibe coding' and its impact on development, strategies for tackling large codebases with long context and agents, and the future of programming languages in the age of AI.

Watch on YouTube: https://www.youtube.com/watch?v=jwbG_m-X-gE

Chapters:
0:00 - Intro
1:10 - Defining Early Coding Goals
6:23 - Ingredients of a Great Coding Model
9:28 - Adapting to Developer Workflows
11:40 - The Rise of Vibe Coding
14:43 - Code as a Reasoning Tool
17:20 - Code as a Universal Solver
20:47 - Evaluating Coding Models
24:30 - Leveraging Internal Googler Feedback
26:52 - Winning Over AI Skeptics
28:04 - Performance Across Programming Languages
33:05 - The Future of Programming Languages
36:16 - Strategies for Large Codebases
41:06 - Hill Climbing New Benchmarks
42:46 - Short-Term Improvements
44:42 - Model Style and Taste
47:43 - 2.5 Pro’s Breakthrough
51:06 - Early AI Coding Experiences
56:19 - Specialist vs. Generalist Models
--------
1:00:27
--------
Sergey Brin on the Future of AI & Gemini
A conversation with Sergey Brin, co-founder of Google and computer scientist working on Gemini, in reaction to a year of progress with Gemini.

Watch on YouTube: https://www.youtube.com/watch?v=o7U4DV9Fkc0

Chapters:
0:20 - Initial reactions to I/O
2:00 - Focus on Gemini’s core text model
4:29 - Native audio in Gemini and Veo 3
8:34 - Insights from model training runs
10:07 - Surprises in current AI developments vs. past expectations
14:20 - Evolution of model training
16:40 - The future of reasoning and Deep Think
20:19 - Google’s startup culture and accelerating AI innovation
24:51 - Closing
--------
27:19
--------
Google I/O 2025 Recap with Josh Woodward and Tulsee Doshi
Learn more:
AI Studio: https://aistudio.google.com/
Gemini Canvas: https://gemini.google.com/canvas
Mariner: https://labs.google.com/mariner/
Gemini Ultra: https://one.google.com/about/google-a...
Jules: https://jules.google/
Gemini Diffusion: https://deepmind.google/models/gemini...
Flow: https://labs.google/flow/about
Notebook LM: https://notebooklm.google.com/
Stitch: https://stitch.withgoogle.com/

Chapters:
0:59 - I/O Day 1 Recap
2:48 - Envisioning I/O 2030
8:11 - AI for Scientific Breakthroughs
9:20 - Veo 3 & Flow
17:35 - Gemini Live & the Future of Proactive Assistants
20:30 - Gemini in Chrome & Future Apps
22:28 - New Gemini Models: DeepThink, Diffusion & 2.5 Flash/Pro Updates
27:19 - Developer Momentum & Feedback Loop
31:50 - New Developer Products: Jules, Stitch & CodeGen in AI Studio
37:44 - Evolving Product Development Process with AI
39:23 - Closing
--------
40:15
--------
Deep Dive into Long Context
Explore the synergy between long context models and Retrieval Augmented Generation (RAG) in this episode of Release Notes. Join Google DeepMind's Nikolay Savinov as he discusses the importance of large context windows, how they enable AI agents, and what's next in the field.

Chapters:
0:52 Introduction & defining tokens
5:27 Context window importance
9:53 RAG vs. Long Context
14:19 Scaling beyond 2 million tokens
18:41 Long context improvements since 1.5 Pro release
23:26 Difficulty of attending to the whole context
28:37 Evaluating long context: beyond needle-in-a-haystack
33:41 Integrating long context research
34:57 Reasoning and long outputs
40:54 Tips for using long context
48:51 The future of long context: near-perfect recall and cost reduction
54:42 The role of infrastructure
56:15 Long-context and agents
Ever wondered what it's really like to build the future of AI? Join host Logan Kilpatrick for a deep dive into the world of Google AI, straight from the minds of the builders. We're pulling back the curtain on the latest breakthroughs, sharing the unfiltered stories behind the tech, and answering the questions you've been dying to ask.
Whether you're a seasoned developer or an AI enthusiast, this podcast is your backstage pass to the cutting edge of AI technology. Tune in for:
- Exclusive interviews with AI pioneers and industry leaders.
- In-depth discussions on the latest AI trends and developments.
- Behind-the-scenes stories and anecdotes from the world of AI.
- Unfiltered insights and opinions from the people shaping the future.
So, if you're ready to go beyond the headlines and get the real scoop on AI, join Logan Kilpatrick on Google AI: Release Notes.