
How I AI

Claire Vo

Available Episodes

5 of 28
  • Evals, error analysis, and better prompts: A systematic approach to improving your AI products | Hamel Husain (ML engineer)
    Hamel Husain, an AI consultant and educator, shares his systematic approach to improving AI product quality through error analysis, evaluation frameworks, and prompt engineering. In this episode, he demonstrates how product teams can move beyond “vibe checking” their AI systems to implement data-driven quality improvement processes that identify and fix the most common errors. Using real examples from client work with Nurture Boss (an AI assistant for property managers), Hamel walks through practical techniques that product managers can implement immediately to dramatically improve their AI products.

    What you’ll learn:
    1. A step-by-step error analysis framework that helps identify and categorize the most common AI failures in your product
    2. How to create custom annotation systems that make reviewing AI conversations faster and more insightful
    3. Why binary evaluations (pass/fail) are more useful than arbitrary quality scores for measuring AI performance
    4. Techniques for validating your LLM judges to ensure they align with human quality expectations
    5. A practical approach to prioritizing fixes based on frequency counting rather than intuition
    6. Why looking at real user conversations (not just ideal test cases) is critical for understanding AI product failures
    7. How to build a comprehensive quality system that spans from manual review to automated evaluation

    Brought to you by:
    • GoFundMe Giving Funds: One account. Zero hassle: https://gofundme.com/howiai
    • Persona: Trusted identity verification for any use case: https://withpersona.com/lp/howiai

    Where to find Hamel Husain:
    Website: https://hamel.dev/
    Twitter: https://twitter.com/HamelHusain
    Course: https://maven.com/parlance-labs/evals
    GitHub: https://github.com/hamelsmu

    Where to find Claire Vo:
    ChatPRD: https://www.chatprd.ai/
    Website: https://clairevo.com/
    LinkedIn: https://www.linkedin.com/in/clairevo/
    X: https://x.com/clairevo

    In this episode, we cover:
    (00:00) Introduction to Hamel Husain
    (03:05) The fundamentals: why data analysis is critical for AI products
    (06:58) Understanding traces and examining real user interactions
    (13:35) Error analysis: a systematic approach to finding AI failures
    (17:40) Creating custom annotation systems for faster review
    (22:23) The impact of this process
    (25:15) Different types of evaluations
    (29:30) LLM-as-a-Judge
    (33:58) Improving prompts and system instructions
    (38:15) Analyzing agent workflows
    (40:38) Hamel’s personal AI tools and workflows
    (48:02) Lightning round and final thoughts

    Tools referenced:
    • Claude: https://claude.ai/
    • Braintrust: https://www.braintrust.dev/docs/start
    • Phoenix: https://phoenix.arize.com/
    • AI Studio: https://aistudio.google.com/
    • ChatGPT: https://chat.openai.com/
    • Gemini: https://gemini.google.com/

    Other references:
    • Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences: https://dl.acm.org/doi/10.1145/3654777.3676450
    • Nurture Boss: https://nurtureboss.io
    • Rechat: https://rechat.com/
    • Your AI Product Needs Evals: https://hamel.dev/blog/posts/evals/
    • A Field Guide to Rapidly Improving AI Products: https://hamel.dev/blog/posts/field-guide/
    • Creating a LLM-as-a-Judge That Drives Business Results: https://hamel.dev/blog/posts/llm-judge/
    • Lenny’s List on Maven: https://maven.com/lenny

    Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
    --------  
    54:48
  • “I’m incapable of doing my job without AI”: How this top PM uses Claude + ChatGPT as his second brain
    Amir Klein is a product manager at Monday.com, leading their AI agents initiative. Despite taking two months of paternity leave, he ranked #4 out of 90 PMs in AI tool usage at his company. In this episode, Amir reveals how he’s become “highly dependent and maybe incapable” of doing his job without AI, showing his custom GPT workflows that help him manage context switching, analyze customer feedback, improve his writing, and prepare for product interviews.

    What you’ll learn:
    • How to create project-specific “second brains” in Claude and ChatGPT that hold context for you across multiple workstreams
    • A step-by-step process for using Claude to build a Reddit scraper that gathers thousands of customer conversations, without coding expertise
    • How to analyze large datasets of customer feedback using AI to identify patterns, priorities, and key discussion points
    • A workflow for creating custom GPTs that help you improve specific skills based on manager feedback
    • Techniques for using GPT voice mode to conduct realistic mock interviews that provide candid feedback on your responses
    • Why “everything is text” should be your mindset when feeding information into AI tools, from PDFs to slide decks
    • How to use AI to respond quickly to stakeholder requests even when you’re context switching between multiple projects

    Brought to you by:
    • GoFundMe Giving Funds: One account. Zero hassle.
    • Miro: A collaborative visual platform where your best work comes to life

    Where to find Amir Klein:
    LinkedIn: https://www.linkedin.com/in/amir-klein-9b8444189/

    Where to find Claire Vo:
    ChatPRD: https://www.chatprd.ai/
    Website: https://clairevo.com/
    LinkedIn: https://www.linkedin.com/in/clairevo/
    X: https://x.com/clairevo

    In this episode, we cover:
    (00:00) Introduction to Amir
    (03:11) Using custom GPT project folders as “second brains”
    (06:24) Building a Reddit scraper with Claude’s help
    (11:02) Analyzing 34,000 rows of Reddit conversations
    (14:06) How to build effective custom GPT knowledge bases
    (18:04) Creating a custom writing coach from Lenny’s Newsletter
    (21:53) Using AI for professional development and feedback
    (24:08) Preparing for product interviews with GPT voice mode
    (31:49) Additional use cases for voice mode
    (33:04) Recap of Amir’s AI workflows
    (35:43) Lightning round and final thoughts

    Tools referenced:
    • Claude: https://claude.ai/
    • ChatGPT: https://chat.openai.com/
    • Reddit API: https://www.reddit.com/dev/api/
    • Python: https://www.python.org/
    • Slack: https://slack.com/

    Other references:
    • Wes Kao: https://weskao.com/
    • Become a better communicator: Specific frameworks to improve your clarity, influence, and impact | Wes Kao (coach, entrepreneur, advisor): https://www.lennysnewsletter.com/p/become-a-better-communicator-specific
    • On Writing Well by William Zinsser: https://www.amazon.com/Writing-Well-Classic-Guide-Nonfiction/dp/0060891548
    • The Elements of Style by Strunk and White: https://www.amazon.com/Elements-Style-Fourth-William-Strunk/dp/020530902X
    • Exponent YouTube channel: https://www.youtube.com/c/ExponentTV
    • monday.com: https://monday.com/
    --------  
    38:50
  • The secret to better AI prototypes: Why Tinder’s CPO starts with JSON, not design | Ravi Mehta (product advisor, previously EIR at Reforge)
    Ravi Mehta, now a product advisor, has built and scaled products used by millions. His past roles include Chief Product Officer at Tinder, Entrepreneur in Residence at Reforge, and senior product leadership positions at Facebook, TripAdvisor, and Xbox. In this episode, Ravi demonstrates his data-driven approach to AI prototyping that produces dramatically better results than traditional "vibe prototyping." He also shares his structured framework for generating professional-quality images in Midjourney that look like they were shot by a professional photographer.

    What you’ll learn:
    • Why most product managers and designers are “vibe prototyping” with AI and getting mediocre results
    • How to use JSON data models instead of design systems as the foundation for better AI prototypes
    • A simple three-part framework for structuring Midjourney prompts to get professional-quality photos
    • How to use Claude and Unsplash’s MCP server to generate realistic data and images for your prototypes
    • Why real data (not Lorem Ipsum) is critical for getting meaningful feedback from stakeholders
    • The film stock “cheat code” that instantly elevates your AI-generated photos

    Brought to you by:
    • Google Gemini: Your everyday AI assistant
    • Persona: Trusted identity verification for any use case

    Where to find Ravi Mehta:
    Website: https://www.ravi-mehta.com/
    Reforge: https://www.reforge.com/profiles/ravi-mehta
    LinkedIn: https://www.linkedin.com/in/ravimehta/
    X: https://x.com/ravi_mehta

    Where to find Claire Vo:
    ChatPRD: https://www.chatprd.ai/
    Website: https://clairevo.com/
    LinkedIn: https://www.linkedin.com/in/clairevo/
    X: https://x.com/clairevo

    In this episode, we cover:
    (00:00) Introduction to Ravi and data-driven prototyping
    (02:31) The problem with “vibe prototyping” in product development
    (04:18) Spec-driven prototyping vs. data-driven prototyping
    (05:27) Demo: Spec-driven approach to prototyping
    (08:26) Limitations of the basic AI prototype approach
    (11:24) The data-driven prototyping approach explained
    (12:08) Demo: Data-driven prototyping
    (17:45) Creating a prototype with the generated JSON data
    (23:33) Comparing the quality difference between approaches
    (26:44) Modifying the prototype
    (28:53) Benefits of this approach
    (34:40) Structured Midjourney prompting
    (36:20) The subject-setting-style framework for better image prompts
    (44:27) Using camera metadata to refine your results
    (48:54) Lightning round and final thoughts

    Tools referenced:
    • Claude: https://claude.ai/
    • Reforge Build: https://www.reforge.com/build
    • Midjourney: https://www.midjourney.com/
    • Unsplash MCP: https://github.com/okooo5km/unsplash-mcp-server-go?utm_source=chatgpt.com

    Other references:
    • Reforge AI Strategy Course: https://www.reforge.com/courses/ai-strategy
    --------  
    54:38
  • The beginner's guide to coding with Cursor | Lee Robinson (Head of AI education)
    Lee Robinson is the head of AI education at Cursor, where he teaches people how to build software with AI. Previously, he helped build Vercel and Next.js as an early employee. In this episode, he demonstrates how Cursor's AI-powered code editor bridges the gap between beginners and experienced developers through automated error fixing, parallel task execution, and writing assistance. Lee walks through practical examples of using Cursor's agent to improve code quality, manage technical debt, and even enhance your writing by eliminating common AI patterns and clichés.

    What you'll learn:
    1. How to use Cursor's AI agent to automatically detect and fix linting errors without needing to understand complex terminal commands
    2. A workflow for running parallel coding tasks by focusing on your main work while the agent handles secondary features in the background
    3. Why setting up typed languages, linters, formatters, and tests creates guardrails that help AI tools generate better code
    4. How to create custom commands for code reviews that automatically check for security issues, test coverage, and other quality concerns
    5. A technique for improving your writing by creating a custom prompt with banned words and phrases that eliminates AI-generated patterns
    6. Strategies for managing context in AI conversations to maintain high-quality responses and avoid degradation
    7. Why looking at code, even when you don't fully understand it, is one of the best ways to learn programming

    Brought to you by:
    • Google Gemini: Your everyday AI assistant
    • Persona: Trusted identity verification for any use case

    Where to find Lee Robinson:
    Twitter/X: https://twitter.com/leeerob
    Website: https://leerob.com

    Where to find Claire Vo:
    ChatPRD: https://www.chatprd.ai/
    Website: https://clairevo.com/
    LinkedIn: https://www.linkedin.com/in/clairevo/
    X: https://x.com/clairevo

    In this episode, we cover:
    (00:00) Introduction to Lee
    (02:04) Understanding Cursor's three-panel interface
    (06:27) The importance of typed languages, linters, and tests
    (11:28) Demo: Using the agent to automatically fix lint errors
    (15:17) Running parallel coding tasks with the agent
    (18:50) Setting up custom rules
    (23:24) Understanding the different AI models
    (24:48) Micro-slicing agent chats for better success
    (27:22) Tips for effective agent usage
    (29:00) Using AI to improve your writing
    (35:47) Lightning round and final thoughts

    Tools referenced:
    • Cursor: https://cursor.com/
    • ChatGPT: https://chat.openai.com/
    • JavaScript: https://developer.mozilla.org/en-US/docs/Web/JavaScript
    • Python: https://www.python.org/
    • TypeScript: https://www.typescriptlang.org/
    • Git: https://git-scm.com/

    Other references:
    • Linting: https://en.wikipedia.org/wiki/Lint_(software)
    --------  
    45:27
  • How I built an Apple Watch workout app using Cursor and Xcode (with zero mobile-app experience)
    Terry Lin is a product manager and developer who built Cooper’s Corner, an AI-powered fitness tracking app that works across iPhone and Apple Watch. Frustrated with traditional fitness apps that require extensive setup and manual logging, Terry created a solution that lets users simply speak their exercises, weights, and reps. The app automatically structures this data and provides analytics on workout consistency and progress. In this episode, Terry shares his vibe-coding process using Cursor and Xcode and explains how he optimizes his codebase for AI collaboration.

    What you’ll learn:
    1. How Terry built a voice-powered fitness tracker that works across iPhone and Apple Watch
    2. His “dual-wielding” workflow, using Cursor for coding and Xcode for building and debugging
    3. Terry’s three-step process for working with AI: create, review, and execute
    4. Why optimizing your codebase for AI collaboration can dramatically improve productivity
    5. How to use index cards and GPT-4 to rapidly prototype mobile interfaces
    6. A technique for “vibe refactoring” that keeps code organized and optimized for both human and AI readability
    7. His “rubber duck” technique to better understand generated code and improve your learning process

    Brought to you by:
    • Paragon: Ship every SaaS integration your customers want
    • Miro: A collaborative visual platform where your best work comes to life

    Where to find Terry Lin:
    LinkedIn: https://www.linkedin.com/in/itsmeterrylin/
    GitHub: https://github.com/itsmeterrylin

    Where to find Claire Vo:
    ChatPRD: https://www.chatprd.ai/
    Website: https://clairevo.com/
    LinkedIn: https://www.linkedin.com/in/clairevo/
    X: https://x.com/clairevo

    In this episode, we cover:
    (00:00) Introduction to Terry and his fitness tracker app
    (02:30) Demo of the voice-powered workout tracking across devices
    (06:40) Analytics and history views for tracking consistency
    (07:20) Dual-wielding Cursor and Xcode for mobile development
    (09:05) Building a v1 using AI tools
    (11:19) A three-step AI workflow: create, review, execute
    (19:38) Token conservation and vibe refactoring explained
    (23:25) Optimizing file sizes for better AI performance
    (25:28) Using “rubber duck” rules to learn from AI-generated code
    (28:13) Prototyping with index cards and GPT-4
    (31:20) Human creativity and the last 10%
    (32:29) Lightning round and final thoughts

    Tools referenced:
    • Cursor: https://cursor.sh/
    • Xcode: https://developer.apple.com/xcode/
    • GPT-4: https://openai.com/gpt-4
    • UX Pilot: https://uxpilot.ai/
    • Figma: https://www.figma.com/
    • Linear: https://linear.app/

    Other references:
    • Apple UI Kit: https://developer.apple.com/design/human-interface-guidelines/
    --------  
    36:16
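
The first episode listed above (Hamel Husain on evals and error analysis) mentions three ideas that are easy to picture in code: binary pass/fail labels, prioritizing fixes by counting failure categories, and checking that an LLM judge agrees with human reviewers. The Python sketch below is only an illustration of those ideas, not the method shown in the episode. The trace records, field names, and failure categories are all hypothetical and invented for this example.

```python
from collections import Counter

# Hypothetical annotated traces (invented data). Each record has a human
# pass/fail label, an LLM-judge pass/fail label, and a failure category
# that a reviewer wrote down while reading the conversation.
traces = [
    {"id": 1, "human_pass": False, "judge_pass": False, "failure": "missed handoff to human"},
    {"id": 2, "human_pass": True, "judge_pass": True, "failure": None},
    {"id": 3, "human_pass": False, "judge_pass": True, "failure": "wrong date in reply"},
    {"id": 4, "human_pass": False, "judge_pass": False, "failure": "missed handoff to human"},
    {"id": 5, "human_pass": True, "judge_pass": True, "failure": None},
    {"id": 6, "human_pass": False, "judge_pass": False, "failure": "wrong date in reply"},
    {"id": 7, "human_pass": False, "judge_pass": False, "failure": "missed handoff to human"},
]


def failure_counts(rows):
    """Count failure categories among human-failed traces, most frequent first."""
    return Counter(r["failure"] for r in rows if not r["human_pass"]).most_common()


def pass_rate(rows):
    """Share of traces that the human reviewer marked as passing."""
    return sum(r["human_pass"] for r in rows) / len(rows)


def judge_agreement(rows):
    """Share of traces where the LLM judge's label matches the human label."""
    return sum(r["human_pass"] == r["judge_pass"] for r in rows) / len(rows)


if __name__ == "__main__":
    for category, count in failure_counts(traces):
        print(f"{count:>3}  {category}")
    print(f"pass rate: {pass_rate(traces):.2f}")
    print(f"judge agreement with humans: {judge_agreement(traces):.2f}")
```

Sorting categories by count gives a simple priority list for fixes, and a low agreement figure would suggest the judge prompt needs work before its scores are trusted.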

About How I AI

How I AI, hosted by Claire Vo, is for anyone wondering how to actually use these magical new tools to improve the quality and efficiency of their work. In each episode, guests will share a specific, practical, and impactful way they’ve learned to use AI in their work or life. Expect 30-minute episodes, live screen sharing, and tips/tricks/workflows you can copy immediately. If you want to demystify AI and learn the skills you need to thrive in this new world, this podcast is for you.

v7.23.9 | © 2007-2025 radio.de GmbH
Generated: 10/14/2025 - 4:52:07 PM