LessWrong posts by zvi

zvi

536 episodes

  • LessWrong posts by zvi

    “AI #175: The Fable Continues” by Zvi

    02/07/2026 | 1h 32 mins.
    Fable's back. Back again. Fable's back. Tell a friend. Use your free week to its fullest.

    This is excellent news. The blip only lasted a few weeks.

    It was still a fiasco, and we have to deal with the fallout.

    Our system remains fully ad hoc. The precedent has been set that we may use export controls on models, or order them taken down on 90 minutes of notice based on a misunderstanding. At least some amount of counterproductive additional locking down has occurred to address Amazon's little demonstration and reassure the government. And for now GPT-5.6 remains in limbo, awaiting its verdict, while OpenAI talks about giving away 5% of the company as tribute.

    I’ll cover that continuing situation on its own, whereas the weekly post is about everything else happening in AI this week.

    Table of Contents


    Language Models Offer Mundane Utility. Exploratory science.

    Language Models Offer Mundane Utility You May Not Want. Google sees all.

    Language Models Don’t Offer Mundane Utility. Too dumb to get smart.

    Huh, Upgrades. GLM-5.2 faster, Nana Banana Lite 2, Claude Desktop on Linux.

    On Your Marks. Remote labor index shoots [...]
    ---
    Outline:
    (01:08) Language Models Offer Mundane Utility
    (02:29) Language Models Offer Mundane Utility You May Not Want
    (04:32) Language Models Don't Offer Mundane Utility
    (05:59) Huh, Upgrades
    (06:33) On Your Marks
    (09:28) Get My Agent On The Line
    (14:29) Deepfaketown and Botpocalypse Soon
    (14:58) Cyber Lack of Security
    (16:11) On Writing
    (21:34) You Drive Me Crazy
    (24:26) They Took Our Jobs
    (30:32) Get Involved
    (30:58) Introducing
    (31:21) In Other AI News
    (32:45) Show Me the Money
    (33:12) Bubble, Bubble, Toil and Trouble
    (35:31) Quiet Speculations
    (39:37) Glorious AI Future
    (43:37) Three Pills
    (44:58) The Anthropic Economic Index
    (46:29) Leader Of The PAC
    (47:48) Theory Of The AI Firm
    (49:01) Chip City
    (50:32) The Week in Audio
    (53:29) People Really Hate AI
    (56:31) Rhetorical Innovation
    (01:00:31) The First Rule Of Functional Decision Theory Is
    (01:03:40) Aligning a Smarter Than Human Intelligence is Difficult
    (01:06:40) Names Have Power
    (01:07:40) Cooperative Alignment
    (01:15:21) People Just Say Things
    (01:16:46) Escape From The Permanent Underclass
    (01:29:51) Other People Are Not As Worried About AI Killing Everyone
    (01:31:07) The Lighter Side
    ---

    First published:

    July 2nd, 2026


    Source:

    https://www.lesswrong.com/posts/WNvBxtbHuLreFe7af/ai-175-the-fable-continues

    ---

    Narrated by TYPE III AUDIO.

    ---
    Images from the article:
    Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
  • LessWrong posts by zvi

    “Claude Sonnet 5 Is Not Frontier But Has Its Uses” by Zvi

    01/07/2026 | 43 mins.
    Fable 5 is back today, baby! Premium subscribers have one week to use it within their subscriptions. First hit's free. Then you pay by the token.

    Today's post is still about Sonnet 5.

    I don’t know that there will be much call for Sonnet 5 for most purposes, given Opus 4.8 exists and especially now that Fable 5 is once again available, but this is what we do here, so sure, why not, system card time, including model welfare, after which we’ll do capabilities.

    Sonnet costs $3/$15 per million tokens, versus $5/$25 for Opus and $10/$50 for Fable, after an introductory period. Once you pay for all the tokens you need, you’re not really saving money; on the ArtificialAnalysis index, for example, Sonnet ended up being more expensive.
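
To make the pricing point concrete, here is a minimal sketch of the arithmetic. The per-million-token rates are the ones quoted above; the token counts are purely hypothetical and only illustrate how a cheaper rate can still produce a larger bill when a model needs more tokens to finish the same job.

```python
# USD per million tokens as (input, output), taken from the rates quoted in the post.
PRICES = {
    "sonnet": (3.0, 15.0),
    "opus": (5.0, 25.0),
    "fable": (10.0, 50.0),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one job at the quoted per-million-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical workload (an assumption, not a measurement): Sonnet is assumed
# to use twice as many tokens as Opus to complete the same task.
opus_cost = job_cost("opus", 1_000_000, 200_000)
sonnet_cost = job_cost("sonnet", 2_000_000, 400_000)

print(f"Opus:   ${opus_cost:.2f}")
print(f"Sonnet: ${sonnet_cost:.2f}")
```

Because the Opus-to-Sonnet price ratio is 5/3 for both input and output tokens, Sonnet comes out cheaper in this sketch only if it uses fewer than about 1.67 times the tokens Opus would.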

    My initial impression is that if you want me to use Sonnet over Opus for most purposes, you’re going to have to offer a bigger discount than that.

    The counterargument is speed. Sonnet 5 is faster without being that much less capable. In many cases, getting into a flow state like that is pretty valuable.

    There are a few agentic scenarios Sonnet 5 has [...]
    ---
    Outline:
    (02:29) Mythos Exists
    (03:10) Introduction (1)
    (03:17) RSP Evaluations (2)
    (04:02) Cyber (3)
    (04:26) Safeguards and Harmlessness (4)
    (04:57) Agentic Safety (5)
    (06:42) Alignment (6)
    (10:45) Illegible Thinking (6.4.5)
    (11:43) Evaluation Awareness
    (12:22) Honesty and Hallucinations (6.5)
    (13:17) Flagged As Unhealthy? (6.5.1)
    (13:53) Model Welfare (7)
    (20:32) Live From AI Village
    (22:19) For I Contain Multitudes
    (29:04) Official Benchmarks
    (33:11) Other People's Benchmarks
    (33:37) Positive Reactions
    (39:04) Negative Reactions
    ---

    First published:

    July 1st, 2026


    Source:

    https://www.lesswrong.com/posts/d9pmwQsFC2AXceryg/claude-sonnet-5-is-not-frontier-but-has-its-uses

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “The Once And Future Fable #5” by Zvi

    30/06/2026 | 30 mins.
    We, or at least ‘more than 100 American institutions,’ got Mythos back this week.

    What we the people do not have is Fable or Sol.

    While we wait for both Claude Fable 5 and GPT-5.6-Sol, today we instead got Claude Sonnet 5. As usual it will take a few days to get a handle on the new model. In this case, Anthropic is representing it as a cheaper and faster version of Opus 4.8, so even though the number says 5 this is a relatively minor development.

    This post expands the Fable series to cover all further developments this week surrounding the Mythos Moment, and the various aspects of handling our new ad hoc licensing regime and figuring out policy going forward, and other aspects of policy as well.

    This includes my notes on various rhetoric being pulled out, where I fear I end up saying similar things every so often, because we are doomed to repeat the cycle. I have accepted my role in that, but those are sections many of you can skip, and are marked in italics accordingly as per usual.

    Table of Contents


    You Should See The Other [...]
    ---
    Outline:
    (01:12) You Should See The Other Guy
    (01:54) DeepMind Coders Of The World, Unite
    (02:45) Report Your Incidents
    (03:04) Good Guy With An AI
    (04:49) Free As In To Give It A Shot
    (08:21) Everything Is Both Speech And Computer
    (11:10) Lambs To The Slaughter
    (15:43) A Sign Saying Beware Of The Leopard
    (16:52) The Once And Present Mythos
    (20:26) What Is To Be Done
    (24:00) Distillation
    (26:24) What Would Banning Open Source Even Mean
    (27:20) Open Weight Models Are Unsafe And Nothing Can Fix This
    ---

    First published:

    June 30th, 2026


    Source:

    https://www.lesswrong.com/posts/phxgfwGNGbanumMMv/the-once-and-future-fable-5

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “WSJ Article Claiming China Has Matched Anthropic Is Obvious Nonsense” by Zvi

    29/06/2026 | 9 mins.
    The Wall Street Journal printed an outright false headline and heavily misleading story claiming this, which of course was uncritically amplified by the usual suspects.

    I post this now on its own so that we have a place to link to, to explain the situation.

    Headline News

    WSJ Headline (Obvious Nonsense): China Has Matched Anthropic in Cybersecurity, Resetting AI Race.

    That. Did. Not. Happen.

    The post even claims, explicitly, that Claude Opus 4.8 similarly ‘matches’ Claude Mythos, a claim which is even more obviously false.

    Shame upon the Wall Street Journal. I fear Gell-Mann Amnesia. If they can get something as important as this so completely wrong, what about everything else?

    I am skipping over the parts that involve accurate reporting, or minor quibbles.

    It seems important to focus on clearly debunking the central false claims.

    Alas, the mistakes made here very much rhyme with mistakes being made throughout all this by the White House, and that get latched onto by certain bad actors, who have played a large part in leaving us unprepared for the Mythos Moment.

    For a full understanding of GLM-5.2, which is indeed an impressive [...]
    ---
    Outline:
    (00:27) Headline News
    (02:09) What Makes Mythos Special
    (03:16) Going Over The Detailed Claims
    (07:38) One Helpful Note
    (08:18) The Overall Impression Is Extremely Wrong
    (08:48) All Of This Has Happened Before And Will Happen Again
    ---

    First published:

    June 29th, 2026


    Source:

    https://www.lesswrong.com/posts/bpBYm5jiS4tpyzuDS/wsj-article-claiming-china-has-matched-anthropic-is-obvious

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “GPT-5.6: The System Card” by Zvi

    28/06/2026 | 53 mins.
    While we wait for a general release, the system card is the best hint as to what is going on with the new candidate for America's Next Top Model, GPT-5.6.

    This is only an OpenAI model card, so by my standards it's a light read. There are a lot of things you get in an Anthropic card that are missing from an OpenAI card.

    Overall, the card gives a clear and consistent impression that GPT-5.6-Sol is a substantial improvement over GPT-5.5, but still short of Mythos.

    OpenAI calls it a ‘step function better’ than GPT-5.5. That seems accurate.

    OpenAI: Sol is our new flagship and a step function better than GPT-5.5.

    Terra delivers performance competitive to GPT-5.5 at 2x lower cost.

    Luna is our most cost-efficient model, delivering strong capability at our lowest cost.

    Together, the GPT-5.6 family gives people and developers more choice in how they balance intelligence, speed, and cost.

    Once available, pricing for GPT-5.6-Sol will be $5/$30, the same as GPT-5.5. Terra is $2.5/$15, Luna is $1/$6.

    They claim it will be on Cerebras at 750 TPS, which is insanely fast. Capacity will be limited, at least at first. [...]

    ---
    Outline:
    (03:49) What's In A Name?
    (04:26) Fix This Code
    (07:08) Crossover Event Requested
    (07:43) Disallowed Content (3)
    (09:03) Avoiding Accidental Data-Destructive Actions (3.3)
    (09:29) Are You Sure? (3.4)
    (09:58) Jailbreaks (4.1)
    (10:14) Prompt Injection (4.2)
    (10:40) HealthBench (5.1)
    (11:00) Dynamic Mental Health Adversarial User Simulations (5.2)
    (12:21) Hallucinations (6)
    (12:50) Isolated Misaligned Actions (7.1)
    (13:10) Going Overboard (7.2)
    (18:11) Chain of Thought Evaluations (7.3)
    (19:18) Bias (8)
    (19:27) Preparedness (9)
    (20:15) Biological Risks (9.1.1)
    (22:15) Cybersecurity (9.1.2)
    (28:40) External Cyber Evaluation FrontierCyber from Irregular (9.1.2.5)
    (30:32) Cyber Conclusions
    (31:07) Recursive Self-Improvement (9.1.3)
    (32:22) METR Warns Us (9.1.3.6)
    (35:04) Everything Is Under Control
    (37:44) Metagaming (7.4)
    (40:17) Apollo Research and Sandbagging
    (43:09) Safeguards (9.3)
    (50:01) Better Not Call Sol Yet
    The original text contained 2 footnotes which were omitted from this narration.
    ---

    First published:

    June 28th, 2026


    Source:

    https://www.lesswrong.com/posts/JFjNmPTbH8kL6xtp6/gpt-5-6-the-system-card

    ---

    Narrated by TYPE III AUDIO.

About LessWrong posts by zvi
Audio narrations of LessWrong posts by zvi