
80,000 Hours Podcast

The 80,000 Hours team

334 episodes

  • I Know How to Build Safe Superintelligence | Yoshua Bengio, the most-cited AI researcher

    07/05/2026 | 2h 33 mins.
    The co-inventor of modern AI and the most-cited living scientist believes he's figured out how to ensure AI is honest, incapable of deception, and never goes rogue. Yoshua Bengio – Turing Award winner and founder of LawZero – is disturbed by the many unintended drives and goals present in today's AIs, their willingness to lie, and their ability to tell when they're being tested. AI companies are trying to stamp out these behaviours in a 'cat-and-mouse game' that Yoshua fears they're losing.
    But Yoshua is optimistic: he believes the companies can win this battle decisively with a single change to how AI models are trained, and he has been developing mathematical proofs to back up the claim. The core idea is that instead of training AI to predict what a human would say, or to produce responses we'd rate highly, we should train it to model what's actually true.
    Yoshua argues this new architecture, which he calls 'Scientist AI,' is a small enough change that we could keep almost all the techniques and data we use to train frontier AIs like Claude and ChatGPT. And that the new architecture need not cost more, could be built iteratively, and might be more capable as well as more honest.
    Links to learn more, video, and full transcript: https://80k.info/bengio
    Until recently, the biggest practical objection to Scientist AI was simple: the world wants agents, and Scientist AI isn’t one. But in new research, Yoshua has extended the design and believes the same honest predictor can be turned into a capable agent without losing its "safety guarantees."
    With the Scientist AI proposal on the table, Yoshua argues that it's absurd to race to get current untrustworthy AI models to design their successors, which the leading companies are attempting to do as soon as possible.
    But critics argue the approach wouldn't be so technically solid in practice, and that frontier capabilities are advancing so fast, and cost so much to match, that Scientist AI risks arriving too late to matter.
    Host Rob Wiblin and AI pioneer Yoshua Bengio cover all this and more in today's conversation.
    LawZero is hiring! https://80k.info/lawzero-jobs
    Coefficient Giving is also hiring for a range of AI-related grantmaker roles: https://80k.info/ai-grantmaker-jobs

    This episode was recorded on April 16, 2026.
    Chapters:
    Yoshua Bengio on making AI honest and safe (00:00:00)
    The Scientist AI in plain English (00:02:26)
    Yoshua on how Scientist AI differs from LLMs (00:06:33)
    How the training data works (00:13:55)
    Can this become an agent? (00:20:48)
    Why Yoshua is more optimistic on alignment now (00:31:43)
    Why companies can't stop racing (00:36:05)
    How close to a working prototype? (00:48:27)
    Honest models might be more capable (00:52:40)
    "Reinforcement learning is evil" (01:00:28)
    Scientist AI from guardrail to agent (01:07:31)
    Can safe AI still be competent? (01:11:29)
    How much will this cost? (01:18:17)
    Can it generalise beyond maths and science? (01:22:13)
    A UN for superintelligence (01:37:52)
    Want to work with Yoshua Bengio? (01:49:32)
    Why smart people ignore AI risk (01:53:00)
    Don't let AI build the next AI (01:59:42)
    Why the public doesn't get the real risk (02:10:34)
    Why Yoshua changed his mind about AI risk (02:19:28)
    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Camera operator: Jeremy Chevillotte
    Production: Nick Stockton, Elizabeth Cox, and Katy Moore
  • '95% of AI Pilots Fail': The hidden agenda behind the viral stat that misled millions

    28/04/2026 | 10 mins.
    You might have heard that '95% of corporate AI pilots' are failing. It was one of the most widely cited AI statistics of 2025, parroted by media outlets everywhere. It helped trigger a Nasdaq selloff and became a pillar of the case that 'AI is overhyped'. The problem: it's 100% wrong. And not by accident either.

    If you carefully read the underlying report, ostensibly from MIT, you find that the data points in the opposite direction.

    But that was all buried, with the authors instead torturing the results to tell a very different narrative. Why?

    Well, the research likely came with a hidden commercial agenda from the start.

    Learn more, video, and full transcript: https://80k.info/mit-ai-study
    Today Rob Wiblin breaks down how an opaque, conflicted, barely-scrutinised report managed to attract the MIT label, move markets, and have a vast impact on global opinion about AI.

    This episode was recorded on February 13, 2026.

    Chapters:
    • The myth (00:00)
    • The math was totally wrong (00:52)
    • The absurd bar for success (01:46)
    • The study ignores its own findings (03:29)
    • The sample was tiny (04:50)
    • The report wasn’t even available to check (05:55)
    • The hidden motives that likely drove this 'research' (06:58)
    • The real lesson (09:28)

    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Camera operator: Dominic Armstrong
    Production: Nick Stockton, Elizabeth Cox, and Katy Moore
  • #242 – Will MacAskill on how we survive the 'intelligence explosion,' AI character, and the case for 'viatopia'

    22/04/2026 | 3h 9 mins.
    Hundreds of millions already turn to AI on the most personal of topics — therapy, political opinions, and how to treat others. And as AI takes over more of the economy, the character of these systems will shape culture on an even grander scale, ultimately becoming “the personality of most of the world’s workforce.”
    So… should they be designed to push us towards the better angels of our nature? Or simply do as we ask? Will MacAskill, philosopher and senior research fellow at Forethought, has been thinking through that and the other thorniest issues that come up in designing an AI personality.
    He’s also been exploring how we might coexist peacefully with the ‘superintelligent AI’ companies are racing to build. He concludes that we should train such systems to be very risk averse, pay them for their work, and build institutions that enable humans to make credible contracts with AIs themselves.
    Will and host Rob Wiblin also discuss what a good world after superintelligence would actually look like — a subject that has received surprisingly little attention from the people working to bring it about. Will argues that we shouldn’t aim for a specific utopian vision: we don’t know enough about what the best possible future actually is to aim directly for it, and trying to lock in today’s best guesses forever risks baking in errors we can’t yet see.
    Will and Rob explore what we can do to steer towards a good future instead, along with why a coalition of democracies building superintelligence together is safer than any single actor, how absurdly useful ChatGPT is for analytic philosophy, and more.

    Learn more, video, and full transcript: https://80k.info/wm26
    This episode was recorded on February 6, 2026.
    Chapters:
    Cold open (00:00:00)
    Will MacAskill is back — for a 6th time! (00:00:29)
    AIs’ “character” could be vital to securing a good future (00:00:59)
    The panic over sycophancy is justified (00:07:54)
    How opinionated should AI be about ethics? (00:12:59)
    Commercial pressures won’t fully determine AI character (00:29:38)
    Risk-averse AI would rather strike a deal than attempt a coup (00:36:46)
    A coalition of democracies building superintelligence is safer than one doing it alone (01:06:40)
    How selfish agents could fund the common good (01:19:13)
    Why not push for pausing AI development? (01:38:39)
    Effective altruism is making a comeback post-SBF (01:48:18)
    EA in the age of AGI (01:56:15)
    Viatopia: an alternative to utopia (02:05:08)
    The least bad alternative to total utilitarianism? (02:34:42)
    How AI could kickstart a golden age of philosophy (02:58:03)
    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Music: CORBIT
    Camera operator: Alex Miles
    Production: Elizabeth Cox, Nick Stockton, and Katy Moore
  • Risks from power-seeking AI systems (article narration by Zershaaneh Qureshi)

    16/04/2026 | 1h 29 mins.
    Hundreds of prominent AI scientists and other notable figures signed a statement in 2023 saying that mitigating the risk of extinction from AI should be a global priority. At 80,000 Hours, we’ve considered risks from AI to be the world’s most pressing problem since 2016. 
    But what led us to this conclusion? Could AI really cause human extinction? We’re not certain, but we think the risk is worth taking very seriously. 
    In particular, as companies create increasingly powerful AI systems, there’s a concerning chance that:
    These AI systems may develop dangerous long-term goals we don’t want.
    To pursue these goals, they may seek power and undermine the safeguards meant to contain them.
    They may even aim to disempower humanity and potentially cause our extinction.
    This article is written by Cody Fenwick and Zershaaneh Qureshi, and narrated by Zershaaneh Qureshi. It discusses why future AI systems could disempower humanity, what current AI research reveals about behaviours like power-seeking and deception, and how you can help mitigate the dangers.
    You can see the original article — packed with graphs, images, footnotes, and further resources — on the 80,000 Hours website: 
    https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/ 
    Chapters:
    Risks from power-seeking AI systems (00:01:00)
    Introduction (00:01:17)
    Summary (00:03:09)
    Why are the risks from power-seeking AI a pressing world problem? (00:04:04)
    Section 1: Humans will likely build advanced AI systems with long-term goals (00:05:43)
    Section 2: AIs with long-term goals may be inclined to seek power (00:11:32)
    Section 3: These power-seeking AI systems could successfully disempower humanity (00:26:26)
    Section 4: People might create power-seeking AI systems without enough safeguards, despite the risks (00:38:34)
    Section 5: Work on this problem is neglected and tractable (00:47:37)
    Section 6: What are the arguments against working on this problem? (00:59:20)
    Section 7: How you can help (01:25:07)
    Thank you for listening (01:28:56)
    Audio editing: Dominic Armstrong
    Production: Zershaaneh Qureshi, Elizabeth Cox, and Katy Moore
  • How scary is Claude Mythos? 303 pages in 21 minutes

    10/04/2026 | 21 mins.
    With Claude Mythos we have an AI that knows when it's being tested, can obscure its reasoning when it wants, and is better at breaking into (and out of) computers than any human alive. Rob Wiblin works through its 244-page System Card and 59-page Alignment Risk Update to explain why: 
    Mythos is a nightmare for computer security
    It has arrived far ahead of schedule
    It might be great news for alignment and safety
    But 3 key problems mean we can’t take its alignment results at face value
    Mythos isn’t building its replacement yet, probably
    Anthropic staff are, for the first time, kinda scared of Claude
    He's losing sleep
    Learn more & full transcript: https://80k.info/mythos
    This episode was recorded on April 9, 2026.
    Chapters:
    Why people are panicking about computer security (01:05)
    Mythos could break out of containment (04:23)
    Anthropic is losing billions in revenue by not releasing Mythos (06:21)
    Mythos is actually the most aligned model to date, except… (07:48)
    Mythos knows when it’s being tested (09:52)
    Mythos can hide its thoughts (11:50)
    Mythos can’t be trusted about whether it’s untrustworthy (14:02)
    Does Mythos advance automated AI R&D? (17:03)
    Mythos scares Anthropic (19:15)
    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Camera operator: Dominic Armstrong
    Production: Elizabeth Cox, Nick Stockton, and Katy Moore

About 80,000 Hours Podcast

The most important conversations about artificial intelligence you won’t hear anywhere else. Subscribe by searching for '80000 Hours' wherever you get podcasts. Hosted by Rob Wiblin, Luisa Rodriguez, and Zershaaneh Qureshi.