Powered by RND
PodcastsSociety & CultureLessWrong (Curated & Popular)

LessWrong (Curated & Popular)

LessWrong
LessWrong (Curated & Popular)
Latest episode

Available Episodes

5 of 589
  • “I am worried about near-term non-LLM AI developments” by testingthewaters
    TL;DR I believe that: Almost all LLM-centric safety research will not provide any significant safety value with regards to existential or civilisation-scale risks. The capabilities-related forecasts (not the safety-related forecasts) of Stephen Brynes' Foom and Doom articles are correct, except that they are too conservative with regards to timelines. There exists a parallel track of AI research which has been largely ignored by the AI safety community.  This agenda aims to implement human-like online learning in ML models, and it is now close to maturity. Keywords: Hierarchical Reasoning Model, Energy-based Model, Test time training. Within 6 months this line of research will produce a small natural-language capable model that will perform at the level of a model like GPT-3, but with improved persistence and effectively no "context limit" since it is constantly learning and updating weights. Further development of this research will produce models that fulfill most of [...] ---Outline:(00:11) TL;DR(01:22) Overview(04:10) The Agenda I am Worried About(07:36) Concrete Predictions(09:29) What I think we should do--- First published: July 31st, 2025 Source: https://www.lesswrong.com/posts/tEZa7PouYatK78bbb/i-am-worried-about-near-term-non-llm-ai-developments --- Narrated by TYPE III AUDIO.
    --------  
    10:54
  • “Optimizing The Final Output Can Obfuscate CoT (Research Note)” by lukemarks, jacob_drori, cloud, TurnTrout
    Produced as part of MATS 8.0 under the mentorship of Alex Turner and Alex Cloud. This research note overviews some early results which we are looking for feedback on. TL;DR: We train language models with RL in toy environments. We show that penalizing some property of the output is sufficient to suppress that property in the chain of thought also, even when that property is relevant to task completion. For example, when we penalize a model for mentioning in its output that it completed a task via a certain form of cheating, its reasoning also omits this fact. We therefore suspect that training against an output monitor is sufficient to cause obfuscated CoTs in more realistic settings. Introduction Chain of thought (CoT) supervision appears in many control and scalable oversight protocols. It has been argued that being able to monitor CoTs for unwanted behavior is a critical property [...] ---Outline:(00:56) Introduction(02:38) Setup(03:48) Single-Turn Setting(04:26) Multi-Turn Setting(06:51) Results(06:54) Single-Turn Setting(08:21) Multi-Turn Terminal-Based Setting(08:25) Word-Usage Penalty(09:12) LLM Judge Penalty(10:12) Takeaways(10:57) AcknowledgementsThe original text contained 1 footnote which was omitted from this narration. --- First published: July 30th, 2025 Source: https://www.lesswrong.com/posts/CM7AsQoBxDW4vhkP3/optimizing-the-final-output-can-obfuscate-cot-research-note --- Narrated by TYPE III AUDIO. ---Images from the article:
    --------  
    11:30
  • “About 30% of Humanity’s Last Exam chemistry/biology answers are likely wrong” by bohaska
    FutureHouse is a company that builds literature research agents. They tested it on the bio + chem subset of HLE questions, then noticed errors in them. The post's first paragraph: Humanity's Last Exam has become the most prominent eval representing PhD-level research. We found the questions puzzling and investigated with a team of experts in biology and chemistry to evaluate the answer-reasoning pairs in Humanity's Last Exam. We found that 29 ± 3.7% (95% CI) of the text-only chemistry and biology questions had answers with directly conflicting evidence in peer reviewed literature. We believe this arose from the incentive used to build the benchmark. Based on human experts and our own research tools, we have created an HLE Bio/Chem Gold, a subset of AI and human validated questions. About the initial review process for HLE questions: [...] Reviewers were given explicit instructions: “Questions should ask for something precise [...] --- First published: July 29th, 2025 Source: https://www.lesswrong.com/posts/JANqfGrMyBgcKtGgK/about-30-of-humanity-s-last-exam-chemistry-biology-answers --- Narrated by TYPE III AUDIO.
    --------  
    6:40
  • “Maya’s Escape” by Bridgett Kay
    Maya did not believe she lived in a simulation. She knew that her continued hope that she could escape from the nonexistent simulation was based on motivated reasoning. She said this to herself in the front of her mind instead of keeping the thought locked away in the dark corners. Sometimes she even said it out loud. This acknowledgement, she explained to her therapist, was what kept her from being delusional. “I see. And you said your anxiety had become depressive?” the therapist said absently, clicking her pen while staring down at an empty clipboard. “No- I said my fear had turned into despair,” Maya corrected. It was amazing, Maya thought, how many times the therapist had refused to talk about simulation theory. Maya had brought it up three times in the last hour, and each time, the therapist had changed the subject. Maya wasn’t surprised; this [...] --- First published: July 27th, 2025 Source: https://www.lesswrong.com/posts/ydsrFDwdq7kxbxvxc/maya-s-escape --- Narrated by TYPE III AUDIO.
    --------  
    20:24
  • “Do confident short timelines make sense?” by TsviBT, abramdemski
    TsviBT Tsvi's context Some context: My personal context is that I care about decreasing existential risk, and I think that the broad distribution of efforts put forward by X-deriskers fairly strongly overemphasizes plans that help if AGI is coming in <10 years, at the expense of plans that help if AGI takes longer. So I want to argue that AGI isn't extremely likely to come in <10 years. I've argued against some intuitions behind AGI-soon in Views on when AGI comes and on strategy to reduce existential risk. Abram, IIUC, largely agrees with the picture painted in AI 2027: https://ai-2027.com/ Abram and I have discussed this occasionally, and recently recorded a video call. I messed up my recording, sorry--so the last third of the conversation is cut off, and the beginning is cut off. Here's a link to the first point at which [...] ---Outline:(00:17) Tsvis context(06:52) Background Context:(08:13) A Naive Argument:(08:33) Argument 1(10:43) Why continued progress seems probable to me anyway:(13:37) The Deductive Closure:(14:32) The Inductive Closure:(15:43) Fundamental Limits of LLMs?(19:25) The Whack-A-Mole Argument(23:15) Generalization, Size, & Training(26:42) Creativity & Originariness(32:07) Some responses(33:15) Automating AGI research(35:03) Whence confidence?(36:35) Other points(48:29) Timeline Split?(52:48) Line Go Up?(01:15:16) Some Responses(01:15:27) Memers gonna meme(01:15:44) Right paradigm? Wrong question.(01:18:14) The timescale characters of bioevolutionary design vs. DL research(01:20:33) AGI LP25(01:21:31) come on people, its \[Current Paradigm\] and we still dont have AGI??(01:23:19) Rapid disemhorsepowerment(01:25:41) Miscellaneous responses(01:28:55) Big and hard(01:31:03) Intermission(01:31:19) Remarks on gippity thinkity(01:40:24) Assorted replies as I read:(01:40:28) Paradigm(01:41:33) Bio-evo vs DL(01:42:18) AGI LP25(01:46:30) Rapid disemhorsepowerment(01:47:08) Miscellaneous(01:48:42) Magenta Frontier(01:54:16) Considered Reply(01:54:38) Point of Departure(02:00:25) Tsvis closing remarks(02:04:16) Abrams Closing Thoughts--- First published: July 15th, 2025 Source: https://www.lesswrong.com/posts/5tqFT3bcTekvico4d/do-confident-short-timelines-make-sense --- Narrated by TYPE III AUDIO. ---Images from the article:
    --------  
    2:10:59

More Society & Culture podcasts

About LessWrong (Curated & Popular)

Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma.If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.
Podcast website

Listen to LessWrong (Curated & Popular), The Turning - Seasons 1, 2 & 3 and many other podcasts from around the world with the radio.net app

Get the free radio.net app

  • Stations and podcasts to bookmark
  • Stream via Wi-Fi or Bluetooth
  • Supports Carplay & Android Auto
  • Many other app features

LessWrong (Curated & Popular): Podcasts in Family

Social
v7.22.0 | © 2007-2025 radio.de GmbH
Generated: 8/2/2025 - 2:05:57 PM