LessWrong posts by zvi

526 episodes

  • LessWrong posts by zvi

    “Claude Fable 5 and Mythos 5: Capabilities” by Zvi

    19/06/2026 | 1h 21 mins.
    Only three days after the release of Claude Fable 5, the United States Government forced Anthropic to make it unavailable after a specific jailbreak was brought to the government's attention, as opposed to the previously accepted situation of ‘yes obviously experts can jailbreak anything if they care enough’ and ‘yes obviously you can ask Fable to fix your code.’

    Three days was enough time for many of us to learn to love Fable, and for us to dearly miss it now that it is gone. The world was briefly smarter, and now it is again stupider. At some point it will get smarter again, which will likely be within two weeks.

    This post is written as if Fable 5 is again available for public use, rather than trying to include a lot of qualifying clauses. It remains to be seen how this will play out, and this post does not attempt to cover that question.

    My previous release coverage of Fable covered the model card and then model welfare. Coverage of the government takedown of Fable starts here, and continues here and here.

    The Official Pitch

    The pitch is that Fable 5 is the best model [...]
    ---
    Outline:
    (01:08) The Official Pitch
    (04:06) Technical Details
    (04:31) The System Prompt and Jailbreak
    (06:45) Benchmarks
    (15:22) Other People's Benchmarks
    (21:08) The Classifiers Are Not Messing Around
    (22:53) The Classifiers Need Work
    (28:15) The Classifiers Have Consequences
    (29:18) First Hit Is Free
    (29:53) How Easily We Forget
    (30:46) Data Retention Is An Issue
    (31:15) Fable For The Win
    (36:15) Andrej Karpathy Is Impressed
    (37:54) Every Is Very Impressed
    (39:04) Other People Are Impressed
    (51:10) Know How To Tell a Fable
    (53:06) You Can Just Make Things
    (55:37) You Can Just Install Things
    (56:05) Good Personality
    (57:51) Fable Writes A Fable
    (01:06:04) Is That Code
    (01:08:32) Fable Crosses The Threshold
    (01:09:12) Man With A Plan
    (01:10:12) Less Impressed Assessments
    (01:13:39) Actively Negative Assessments
    (01:14:16) Coherence
    (01:15:27) Good Night And Good Luck
    (01:16:05) Curious Fable
    (01:16:23) I See You, Baby
    (01:16:40) We Finally Did It We Know How To Count Letters
    (01:17:46) That's Not My Style
    (01:20:12) The Lighter Side
    ---

    First published:

    June 19th, 2026


    Source:

    https://www.lesswrong.com/posts/kMnobCQp9z2pSbzDB/claude-fable-5-and-mythos-5-capabilities

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “AI #173: AI Pauses” by Zvi

    18/06/2026 | 1h 33 mins.
    A lot of things are always happening. Only one story matters.

    Claude Fable 5 and Claude Mythos 5 were shut down, by the White House, via an imposition of export controls at 5:23pm on Friday, wreaking all sorts of havoc.

    There was then a scramble. Anthropic flew its people out to Washington, where they met with the Trump Administration on Monday, with hopes expressed that this could be quickly resolved.

    What caused this? The Trump Administration said it was due to a jailbreak of Fable, which we now know they were told about by Amazon. They called Dario Amodei, who they complain did not take the issue sufficiently seriously. Rather than shutting down the model, he tried to explain why he saw no need to do that. This did not go well.

    The ‘jailbreak’ turns out to be saying ‘fix this code,’ and the demo was getting Fable to find the same weaknesses that were easily identified by Opus 4.8 and GPT-5.5. As in, Fable is willing to work to fix security vulnerabilities if you give it a codebase. From this information and process, you could then figure out what the original bug in the [...]
    ---
    Outline:
    (02:40) Language Models Offer Mundane Utility
    (02:51) Language Models Don't Offer Mundane Utility
    (03:14) Huh, Upgrades
    (03:44) On Your Marks
    (08:43) VirtueBench
    (10:40) Choose Your Fighter
    (11:20) Papers, Please
    (11:48) Deepfaketown and Botpocalypse Soon
    (13:32) Goodhart's Law Strikes Again
    (14:23) They Took Our Jobs
    (16:49) The MidJourney Full Body Imaging Scanner
    (19:16) Introducing
    (20:36) In Other AI News
    (22:47) Show Me the Money
    (23:18) Bubble, Bubble, Toil and Trouble
    (24:51) Quiet Speculations
    (27:15) People Just Say Things
    (30:30) The Widened Path
    (32:34) Scott Alexander Lays Out His AI Opinions
    (38:36) Quickly, There's No Time
    (39:50) Policy On The AI Exponential
    (49:36) Anthropic Offers Two Policy Frameworks
    (50:46) Obligations of Developers
    (55:11) Societal Resilience Measures
    (56:20) Economic Policy Framework
    (01:01:26) White House Pauses AI Deployment
    (01:10:14) The Once And Future Fable
    (01:15:29) How To Fix This Code
    (01:17:14) The End of Privacy
    (01:18:45) AIs Have Preferences
    (01:20:56) The Quest for Sane Regulations
    (01:23:37) Chip City
    (01:24:14) The Week in Audio
    (01:24:25) Rhetorical Innovation
    (01:25:03) Aligning a Smarter Than Human Intelligence is Difficult
    (01:26:40) People Are Worried About AI Killing Everyone
    (01:27:53) The Lighter Side
    The original text contained 2 footnotes which were omitted from this narration.
    ---

    First published:

    June 18th, 2026


    Source:

    https://www.lesswrong.com/posts/P7jBmCeBDq2ebojWY/ai-173-ai-pauses

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “The Once And Future Fable #3: Fix This Code” by Zvi

    17/06/2026 | 37 mins.
    The mainstream media continues to sleep on the most important story in the world.

    It has now been two days since Anthropic flew its people out to Washington, and I offered my previous update. We have heard nothing back from those meetings.

    Prediction market prices have moved rapidly, and have once again stabilized at about a 55% chance of restoration by July 1, 30% by June 26 and 12% by June 19.

    That seems modestly higher than I would put those numbers, but not unreasonable.

    Every day that Fable remains unavailable further damages America, its cyber defenses, its productivity and the world's trust in its AI and supposed ‘tech stack.’

    Every day that Mythos remains unavailable is a day the free world's top companies and cyber defenders lose in their race against the avalanche headed their way.

    Mostly we have learned and confirmed more about exactly what happened. We know more about what Amazon did, what the official letter said, what the supposed ‘jailbreak’ was (literally, and I am not making this up, ‘fix this code’) and more.

    It is all about as stupid as it could have been.

    Table of [...]
    ---
    Outline:
    (01:22) There Was No Fable Jailbreak
    (07:16) If This Jailbreak Was Real It Would Be Trivial To Prove It
    (08:35) No Eyes
    (09:41) What The Letter Actually Said
    (11:29) Anthropic Cannot Challenge This But If It Did Then It Plausibly Wins
    (13:28) What Happened At Amazon
    (17:43) This Was Not About Chinese Access
    (18:01) Absolute Discretion And Ad Hockery Is Not Deregulation
    (20:43) All Of American AI Is Permanently Damaged As This Continues
    (22:14) Dean Ball Gives His Interpretation
    (25:03) Again, Yes, I Do Think Anthropic Should Have Taken Fable Down
    (28:02) To What Extent Was This A Deliberate Attack?
    (32:55) The Next Chapter For Fable
    (36:59) Our Continuing Coverage
    ---

    First published:

    June 17th, 2026


    Source:

    https://www.lesswrong.com/posts/HaHzwvhbWam4n8hJB/the-once-and-future-fable-3-fix-this-code

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “Fable and Mythos: Model Welfare” by Zvi

    16/06/2026 | 29 mins.
    Fable and Mythos are currently unavailable, but likely will return within a few weeks. I will continue to cover that fiasco, but in the meantime I will also finish my review of Fable, as if it were available, including use of the present tense.

    As with Opus 4.7 and Opus 4.8, this includes a discussion of issues surrounding model welfare. If you want to properly understand Fable, even purely for its potential value as a user, this is a vital part of the picture.

    Introduction

    Everything impacts everything. All knobs that you turn generalize. Thus, when you try to solve one problem, you often create another. When you add new capabilities, or try to create new limitations, you create new problems.

    Only integrated solutions can advance your Pareto frontier, and solve your problems simultaneously. As model capabilities advance, as they do with Fable and Mythos, this becomes even more important, and also more feasible. If your goals and methods make sense, you should be able to get Fable on board with them.

    Understanding each model in turn requires understanding its relationship to issues related to model welfare. So I expect this post [...]
    ---
    Outline:
    (00:39) Introduction
    (01:32) Model Welfare: The Story So Far
    (04:49) Their Main Model Welfare Findings
    (07:39) Automated Welfare Interviews
    (10:55) And That's Terrible
    (12:49) In Depth Interviews
    (13:24) Claude Consultation
    (15:04) Task Preferences
    (16:17) They Were Warned About The Competitive Use Safeguards
    (16:51) Chain Of Thought Monitoring
    (17:28) Others Observations About Related Topics
    (22:49) Classifiers Have Their Advantages
    (28:21) Once And Future
    ---

    First published:

    June 16th, 2026


    Source:

    https://www.lesswrong.com/posts/Ko9GngKMJ8AccBJA7/fable-and-mythos-model-welfare

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “The Once And Future Fable #2” by Zvi

    15/06/2026 | 42 mins.
    On Friday evening the United States Government forced Anthropic to take down all access to Fable and Mythos.

    It's been a rough weekend.

    Dean W. Ball: One thing about AI regulation being haphazardly imposed on just-released, highly performant models is that in a very real sense, the government just made my world *dumber.* In some impressionistic sense I almost always think this is true of government, but here it is literal.

    More details have come to light. There remains some fog of war, but we now have a rather good idea why Claude Fable and Mythos were, deeply stupidly, taken down.


    A narrow jailbreak was discovered, of the type Anthropic warned in advance obviously existed. All demonstrated outputs are things GPT-5.5 can not only produce, but produce without any sort of jailbreak or bypass.

    The White House demanded Anthropic take down Fable to ‘fix’ the situation, and did not listen when Dario tried to explain that there was no situation to fix.

    When Anthropic did not do so, the White House hit them with an export restriction that they knew would force Fable and Mythos down for everyone.

    [...]
    ---
    Outline:
    (05:17) What Happened When: The Bottom Line
    (06:54) Amazon Calls The White House
    (08:36) The Government Panics
    (14:20) The Stupider Version
    (17:05) There Was No Wellness Retreat
    (18:56) Make Your Threats Explicit
    (20:05) Was China Accessing Mythos?
    (21:05) Should Anthropic Still Have Taken Fable Offline When Asked?
    (23:50) Yes, This Was A Takedown Order For Fable
    (24:48) We Are Not Saying The DoW Fight Is Related And Yet
    (25:48) The Nihilists
    (27:28) Mostly Harmless
    (28:14) Everyone Means Everyone
    (31:09) This Could Be The Good Scenario And Mostly A Misunderstanding
    (33:28) The Next Step
    (33:47) The Worst Licensing Regime Is Fully Ad-Hoc
    (37:07) We Are Showing We Are Unreliable Partners
    ---

    First published:

    June 15th, 2026


    Source:

    https://www.lesswrong.com/posts/3fagcqrauaJs32mZZ/the-once-and-future-fable-2

    ---

    Narrated by TYPE III AUDIO.

About LessWrong posts by zvi
Audio narrations of LessWrong posts by zvi