“On the Meta and DeepMind Safety Frameworks” by Zvi
This week we got a revision of DeepMind's safety framework, and the first version of Meta's framework. This post covers both of them.
Table of Contents
Meta's RSP (Frontier AI Framework).
DeepMind Updates its Frontier Safety Framework.
What About Risk Governance.
Where Do We Go From Here?
Here are links to previous coverage of DeepMind's Framework 1.0, OpenAI's Framework, and Anthropic's Framework.
Meta's RSP (Frontier AI Framework)
Since there is a law saying no two companies can call these documents by the same name, Meta is here to offer us its Frontier AI Framework, explaining how Meta is going to keep us safe while deploying frontier AI systems.
I will say up front, if it sounds like I’m not giving Meta the benefit of the doubt here, it's because I am absolutely not giving Meta the benefit of [...]

Outline:
(00:14) Meta's RSP (Frontier AI Framework)
(16:10) DeepMind Updates its Frontier Safety Framework
(31:05) What About Risk Governance
(33:42) Where Do We Go From Here?

---
First published:
February 7th, 2025
Source:
https://www.lesswrong.com/posts/etqbEF4yWoGBEaPro/on-the-meta-and-deepmind-safety-frameworks
---
Narrated by TYPE III AUDIO.
--------
35:05
“AI #102: Made in America” by Zvi
I remember that week I used r1 a lot, and everyone was obsessed with DeepSeek.
They earned it. DeepSeek cooked, r1 is an excellent model. Seeing the Chain of Thought was revolutionary. We all learned a lot.
It's still #1 in the app store, there are still hysterical, misinformed NYT op-eds and calls for insane reactions in all directions, and plenty of jingoism to go around, largely based on that highly misleading $6 million cost number for DeepSeek's v3 and a misunderstanding of how AI capability curves move over time.
But like the tariff threats, that's so yesterday now, for those of us who live in the unevenly distributed future.
All my reasoning model needs go through o3-mini-high, and Google's fully unleashed Flash Thinking for free. Everyone is exploring OpenAI's Deep Research, even in its early form, and I finally have an entity [...]

Outline:
(01:15) Language Models Offer Mundane Utility
(07:23) o1-Pro Offers Mundane Utility
(10:35) We're in Deep Research
(17:08) Language Models Don't Offer Mundane Utility
(17:49) Model Decision Tree
(20:43) Huh, Upgrades
(21:57) Bot Versus Bot
(24:04) The OpenAI Unintended Guidelines
(26:40) Peter Wildeford on DeepSeek
(29:18) Our Price Cheap
(35:25) Otherwise Seeking Deeply
(44:13) Smooth Operator
(46:46) Have You Tried Not Building An Agent?
(51:58) Deepfaketown and Botpocalypse Soon
(54:56) They Took Our Jobs
(01:08:29) The Art of the Jailbreak
(01:08:56) Get Involved
(01:13:05) Introducing
(01:13:45) In Other AI News
(01:16:37) Theory of the Firm
(01:21:32) Quiet Speculations
(01:24:36) The Quest for Sane Regulations
(01:33:33) The Week in Audio
(01:34:41) Rhetorical Innovation
(01:38:22) Aligning a Smarter Than Human Intelligence is Difficult
(01:40:33) The Alignment Faking Analysis Continues
(01:44:24) Masayoshi Son Follows Own Advice
(01:48:22) People Are Worried About AI Killing Everyone
(01:50:32) You Are Not Ready
(02:00:45) Other People Are Not As Worried About AI Killing Everyone
(02:02:53) The Lighter Side

---
First published:
February 6th, 2025
Source:
https://www.lesswrong.com/posts/rAaGbh7w52soCckNC/ai-102-made-in-america
---
Narrated by TYPE III AUDIO.
--------
2:07:31
“The Risk of Gradual Disempowerment from AI” by Zvi
The baseline scenario as AI becomes AGI becomes ASI (artificial superintelligence), if nothing more dramatic goes wrong first and even if we successfully ‘solve alignment’ of AI to a given user and developer, is the ‘gradual’ disempowerment of humanity by AIs, as we voluntarily grant them more and more power in a vicious cycle, after which AIs control the future and an ever-increasing share of its real resources. It is unlikely that humans survive it for long.
This gradual disempowerment is far from the only way things could go horribly wrong. There are various other ways things could go horribly wrong earlier, faster and more dramatically, especially if we indeed fail at alignment of ASI on the first try.
Gradual disempowerment is still a major part of the problem, including in worlds that would otherwise have survived those other threats. And I don’t know of any good [...]

Outline:
(01:15) We Finally Have a Good Paper
(02:30) The Phase 2 Problem
(05:02) Coordination is Hard
(07:59) Even Successful Technical Solutions Do Not Solve This
(08:58) The Six Core Claims
(14:35) Proposed Mitigations Are Insufficient
(19:58) The Social Contract Will Change
(21:07) Point of No Return
(22:51) A Shorter Summary
(24:13) Tyler Cowen Seems To Misunderstand Two Key Points
(25:53) Do You Feel in Charge?
(28:04) We Will Not By Default Meaningfully 'Own' the AIs For Long
(29:53) Collusion Has Nothing to Do With This
(32:38) If Humans Do Not Successfully Collude They Lose All Control
(34:45) The Odds Are Against Us and the Situation is Grim

---
First published:
February 5th, 2025
Source:
https://www.lesswrong.com/posts/jEZpfsdaX2dBD9Y6g/the-risk-of-gradual-disempowerment-from-ai
---
Narrated by TYPE III AUDIO.
--------
37:25
“We’re in Deep Research” by Zvi
The latest addition to OpenAI's Pro offerings is their version of Deep Research.
Have you longed for 10k word reports on anything your heart desires, 100 times a month, at a level similar to a graduate student intern? We have the product for you.
Table of Contents
The Pitch.
It's Coming.
Is It Safe?
How Does Deep Research Work?
Killer Shopping App.
Rave Reviews.
Research Reports.
Perfecting the Prompt.
Not So Fast!
What's Next?
Paying the Five.
The Lighter Side.
The Pitch
OpenAI: Today we’re launching deep research in ChatGPT, a new agentic capability that conducts multi-step research on the internet for complex tasks. It accomplishes in tens of minutes what would take a human many hours.
Sam Altman: Today we launch Deep Research, our next agent.
This is like a superpower; experts on [...]

Outline:
(00:20) The Pitch
(03:12) It's Coming
(05:01) Is It Safe?
(09:49) How Does Deep Research Work?
(10:47) Killer Shopping App
(12:17) Rave Reviews
(18:33) Research Reports
(31:21) Perfecting the Prompt
(32:26) Not So Fast!
(35:46) What's Next?
(36:59) Paying the Five
(37:59) The Lighter Side

---
First published:
February 4th, 2025
Source:
https://www.lesswrong.com/posts/QqSxKRKJupjuDkymQ/we-re-in-deep-research
---
Narrated by TYPE III AUDIO.
--------
39:20
“o3-mini Early Days” by Zvi
New model, new hype cycle, who dis?
On a Friday afternoon, OpenAI was proud to announce the new model o3-mini and also o3-mini-high which is somewhat less mini, or for some other reasoning tasks you might still want o1 if you want a broader knowledge base, or if you’re a pro user o1-pro, while we wait for o3-not-mini and o3-pro, except o3 can use web search and o1 can’t so it has the better knowledge in that sense, then on a Sunday night they launched Deep Research which is different from Google's Deep Research but you only have a few of those queries so make them count, or maybe you want to use Operator?
Get it? Got it? Good.
Yes, Pliny jailbroke o3-mini on the spot, as he always does.
This post mostly skips over OpenAI's Deep Research (o3-DR? OAI-DR?). I need more time for [...]

Outline:
(01:16) Feature Presentation
(04:37) Q&A
(09:14) The Wrong Side of History
(13:29) The System Card
(22:08) The Official Benchmarks
(24:55) The Unofficial Benchmarks
(27:43) Others Report In
(29:47) Some People Need Practical Advice

---
First published:
February 3rd, 2025
Source:
https://www.lesswrong.com/posts/srdxEAcdmetdAiGcz/o3-mini-early-days
---
Narrated by TYPE III AUDIO.