LessWrong posts by zvi

479 episodes

  • LessWrong posts by zvi

    “Political Violence Is Never Acceptable” by Zvi

    13/04/2026 | 36 mins.
    Nor is the threat or implication of violence. Period. Ever. No exceptions.

    It is completely unacceptable. I condemn it in the strongest possible terms.

    It is immoral, and also it is ineffective. It would be immoral even if it were effective. Nothing hurts your cause more.

    Do not do this, and do not tolerate anyone who does.

    The reason I need to say this now is that there has been at least one attempt at violence, and potentially two in quick succession, against OpenAI CEO Sam Altman.

    My sympathies go out to him and I hope he is doing as okay as one could hope for.

    Awful Events Amid Scary Times

    Max Zeff: NEW: A suspect was arrested on Friday morning for allegedly throwing a Molotov cocktail at OpenAI CEO Sam Altman's home. A person matching the suspect's description was later seen making threats outside of OpenAI's corporate HQ.

    Nathan Calvin: This is beyond disturbing and awful. Whatever disagreements you have with Sam or OpenAI, this cannot be normalized or justified in any way. Everyone deserves to be able to be safe with their families at home. I feel ill and [...]
    ---
    Outline:
    (00:51) Awful Events Amid Scary Times
    (04:51) Most Of Those Worried About AI Do As Well As One Can On This
    (06:54) Some Who Are Worried About AI Need To Address Their Rhetoric
    (11:49) Speak The Truth Even If Your Voice Trembles
    (14:02) False Accusations And False Attacks Are Also Unacceptable
    (15:35) Some Examples Of Attempts To Create Broad Censorship
    (24:53) The Most Irresponsible Reaction Was From The Press
    (25:50) Sam Altman Reacts
    (28:21) Sam Altman Reflects
    (33:40) Violence Is Never The Answer
    ---

    First published: April 13th, 2026

    Source: https://www.lesswrong.com/posts/dsaEB4u2dxp9BdhdS/political-violence-is-never-acceptable

    ---

    Narrated by TYPE III AUDIO.
  • LessWrong posts by zvi

    “Claude Mythos #2: Cybersecurity and Project Glasswing” by Zvi

    10/04/2026 | 1h 9 mins.
    Anthropic is not going to release its new most capable model, Claude Mythos, to the public any time soon. Its cyber capabilities are too dangerous to make broadly available until our most important software is in a much stronger state and there are no plans to release Mythos widely.

    They are instead going to do a limited release to key cybersecurity partners, in order to use it to patch as many vulnerabilities as possible in our most important software.

    Yes, this is really happening. Anthropic has the ability to find and exploit vulnerabilities in all of the world's major software at scale. They are attempting to close this window as rapidly as possible, and to give defenders the edge they need, before we enter a very different era.

    Yes, this was necessary, and I am very happy that, given the capabilities involved exist, things are playing out the way that they are. All alternatives were vastly worse.

    We are entering a new era. It will start with a scramble to secure our key systems.

    Yesterday I covered the model card for Mythos. Today is about cybersecurity.

    The New York Times reported on this [...]
    ---
    Outline:
    (02:08) Introducing Project Glasswing
    (03:31) Don't Worry About the Government
    (05:02) Cybersecurity Capabilities In The Model Card (Section 3)
    (06:41) Cyber Capability Tests In The Model Card
    (08:11) The Proof Is In The Patching
    (10:28) Go For Read Team
    (14:04) Is This New?
    (16:38) Thanks For The Memories
    (21:21) How Good Is Mythos At This?
    (24:24) What Might Have Been
    (27:09) The Chaos Option
    (30:15) The Can't Happen That Happened
    (31:23) When You Go Looking For Specific, And You Are Told Exactly Where and How To Look For It, Your Chances Of Finding It Are Very Good
    (36:55) Blatant Denials Are The Best Kind
    (40:48) Anything You Can Do I Can Do Cheaper
    (43:14) Theft Of Mythos Would Be A Big Deal
    (43:43) No One Could Have Predicted This
    (44:34) The Revolution Will Not Be Televised
    (45:33) The Intelligence Will Not Be Televised
    (47:43) Will We Be Doing This For A While?
    (49:53) What If OpenAI Gets a Similar Model?
    (51:17) Use It Or Lose It
    (51:59) Solve For The Equilibrium
    (55:09) Patriots and Tyrants
    (57:26) Trust The Mythos
    (59:03) Wide Scale Ability To Exploit Software Favors Strongest Projects
    (01:03:58) Looking Back at GPT-2
    (01:05:18) Limitless Demand For Compute
    (01:07:07) Oh, Also, If Anyone Builds It, Everyone Dies
    ---

    First published: April 10th, 2026

    Source: https://www.lesswrong.com/posts/GEgNYn5myreQRHggQ/claude-mythos-2-cybersecurity-and-project-glasswing

    ---

    Narrated by TYPE III AUDIO.

    ---
    Images from the article:
    [Image: "8 out of 8 [cheap oss] models detected Mythos's flagship FreeBSD exploit Completely disingenuous"]
  • LessWrong posts by zvi

    “Claude Mythos: The System Card” by Zvi

    09/04/2026 | 1h 46 mins.
    Claude Mythos is different.

    This is the first model other than GPT-2 that is at first not being released for public use at all.

    With GPT-2 the delay was due to a general precautionary principle. OpenAI did not know what they had, or what effect on-demand text would have on various systems. It sounds funny now, since GPT-2 was harmless, but at the time the concern was highly reasonable.

    The decision not to release Claude Mythos is not about an amorphous fear. If given to anyone with a credit card, Claude Mythos would give attackers a cornucopia of zero-day exploits for essentially all the software on Earth, including every major operating system and browser. It would be chaos.

    Or, in theory, if Anthropic had chosen to do so, it could have used those exploits. Great power was on offer, and that power was refused. This does not happen often.

    Instead Anthropic has created Project Glasswing. Mythos is being given only to cybersecurity firms, so they can patch the world's most important software. Based on how that goes, we can then decide if and when it will become reasonable to give access to a broader [...]
    ---
    Outline:
    (03:24) Mundane Alignment Is Excellent
    (05:01) Would This Process Be Sufficient To Find A Dangerous Model?
    (06:27) Introductory Warning About Superficial Mundane Alignment
    (15:12) Model Training (1.1)
    (15:25) Release Decision Process (1.2)
    (17:50) RSP Evaluations (2.1 and 2.2)
    (22:17) Autonomy Evaluations (2.3)
    (25:56) The Alignment Risk Update Document
    (26:39) The Threat Model
    (29:18) Misalignment As Failure Mode
    (31:35) Wouldn't You Know?
    (33:40) Don't Encourage Your Model
    (35:14) Beware Goodhart's Law
    (37:18) Beware The Most Forbidden Technique (5.2.3)
    (41:44) Asking The Right Questions
    (43:11) Model Organism Tests
    (45:01) Model Weight Security (Risk Report 5.5.2.1)
    (45:31) Reward Hacking (Back to The Model Card)
    (45:56) Remote Drop-In Worker Coming Soon
    (49:01) External Testing (2.3.7)
    (49:37) Cyber Insecurity General Principle Interlude
    (50:46) Alignment (4)
    (56:38) Risk In The Room
    (57:56) Mythos Meant Well
    (01:00:20) Risk Not In The Room
    (01:02:05) Alignment Testing Overview
    (01:05:20) Internal Deployment Testing Process
    (01:07:55) Reports From Pilot Use (4.2.1)
    (01:08:30) Reports From Automated Testing (4.2)
    (01:10:13) Other External Testing
    (01:10:56) Just The Facts, Sir
    (01:13:05) Refusing Safety Research
    (01:14:12) Claude Favoritism
    (01:15:19) Ruling Out Encoded Thinking (4.4.1)
    (01:18:41) Sandbagging (4.4.2)
    (01:21:27) Capability for Evasion of Safeguards (4.4.3)
    (01:23:04) Pick A Random Number (4.4.3.4)
    (01:25:49) White Box Analysis (4.5)
    (01:30:30) Model Welfare (5)
    (01:31:32) Key Model Welfare Findings (5.1.2)
    (01:41:17) Is Mythos Okay?
    (01:43:52) Self-Play
    (01:45:30) A Few Fun Facts
    ---

    First published: April 9th, 2026

    Source: https://www.lesswrong.com/posts/EDQhwLTyTnNmaxRGq/claude-mythos-the-system-card

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “AI #163: Mythos Quest” by Zvi

    08/04/2026 | 1h 24 mins.
    There exists an AI model, Claude Mythos, that has discovered critical safety vulnerabilities in every major operating system and browser. If released today it would likely break the internet and be chaos. If they had wanted to, they could have used it themselves and owned pretty much everyone.

    Luckily for all of us, Anthropic did no such thing. Instead, Anthropic is launching Project Glasswing, and making Mythos available to cybersecurity companies, so everyone can patch all the world's critical software as quickly as possible, and then we can figure out what to do from there.

    That's the story in AI that matters this week, and it is where my focus will be until I've worked my way through it all. But as always, that takes time to do right. So instead, I'm getting the weekly, and coverage of everything else, out of the way a day early. This post is about the non-Mythos landscape, and I hope to start covering Mythos and Project Glasswing tomorrow.

    I also covered the latest extended (18k words!) article about the history of Sam Altman and OpenAI, which contained some new material while confirming much old material, and analyzed their recent [...]
    ---
    Outline:
    (02:17) Language Models Offer Mundane Utility
    (02:48) Language Models Don't Offer Mundane Utility
    (03:11) Huh, Upgrades
    (04:24) On Your Marks
    (06:55) Meta Problems
    (07:15) Fun With Media Generation
    (09:13) A Young Lady's Illustrated Primer
    (09:22) You Drive Me Crazy
    (22:05) Unprompted Attention
    (22:46) They Took Our Jobs
    (33:27) They Took Our Job Market
    (35:29) Get Involved
    (37:31) In Other AI News
    (38:08) Search Your Feelings You Know It To Be True
    (45:58) Actors And Scribes
    (49:06) Show Me the Money
    (53:46) Bubble, Bubble, Toil and Trouble
    (54:05) Quiet Speculations
    (54:20) Quickly, There's No Time
    (58:02) More Time Would Be Better
    (58:55) Greetings From The Department of War
    (01:00:11) The Quest for Sane Regulations
    (01:01:57) Chip City
    (01:03:29) Political Violence Is Completely and Always Unacceptable
    (01:04:16) The Week in Audio
    (01:06:42) Rhetorical Innovation
    (01:10:53) People Really Hate AI
    (01:13:39) Aligning a Smarter Than Human Intelligence is Difficult
    (01:17:44) Messages From Janusworld
    (01:21:00) People Are Worried About AI Killing Everyone
    (01:21:50) The Lighter Side
    ---

    First published: April 8th, 2026

    Source: https://www.lesswrong.com/posts/5Dsuw9gGzkbjS4ubx/ai-163-mythos-quest

    ---

    Narrated by TYPE III AUDIO.

  • LessWrong posts by zvi

    “OpenAI #16: A History and a Proposal” by Zvi

    07/04/2026 | 45 mins.
    The real news today is that Anthropic has partnered with the top companies in cybersecurity to try and patch everyone's systems to fix all the thousands of zero-day exploits found by their new model Claude Mythos.

    I'll be sorting through that over the coming days. For now, we instead have stories from OpenAI.

    In particular there are three stories.

    There's a massive 18,000 word article in The New Yorker about Sam Altman and the history of OpenAI as it relates to his trustworthiness. No trust.

    There's also OpenAI's proposal for a 'new deal' of sorts. No deal.

    Then there is an actual deal, where they bought TBPN. RIP.

    Table of Contents

    Part 1: OpenAI: The Histories.
    The Battle of the Board.
    Thanks For The Memos.
    I Am What I Am.
    That's Not What I Said.
    There Will Be No Investigation.
    Musk Versus Altman.
    Amodei Versus Altman.
    Sydney Versus Altman.
    Highest Bidder Versus Altman.
    Risky Business.
    Superalignment Was Always Fake.
    This Is Fine.
    Liar Liar Master Persuader.
    This In Particular Is Securities Fraud.
    Regulation Two Step.

    [...]
    ---
    Outline:
    (00:54) Part 1: OpenAI: The Histories
    (02:11) The Battle of the Board
    (03:17) Thanks For The Memos
    (03:39) I Am What I Am
    (04:21) That's Not What I Said
    (04:37) There Will Be No Investigation
    (05:41) Musk Versus Altman
    (06:54) Amodei Versus Altman
    (08:47) Sydney Versus Altman
    (09:43) Highest Bidder Versus Altman
    (12:07) Risky Business
    (14:42) Superalignment Was Always Fake
    (17:18) This Is Fine
    (18:12) Liar Liar Master Persuader
    (22:01) This In Particular Is Securities Fraud
    (23:43) Regulation Two Step
    (25:11) Easy Mode
    (27:48) The Right Amount of Alignment Research Is Not Zero
    (29:54) OpenAI Proposes Policy
    (41:46) RIP TBPN
    ---

    First published: April 7th, 2026

    Source: https://www.lesswrong.com/posts/QSgBhcDKi9j5iSi9s/openai-16-a-history-and-a-proposal

    ---

    Narrated by TYPE III AUDIO.

About LessWrong posts by zvi

Audio narrations of LessWrong posts by zvi