I’ve been trying to find a slot for this one for a while. I am thrilled that today had sufficiently little news that I am comfortable posting this.
Gemini 3.1 scores very well on benchmarks, but most of us had the same reaction after briefly trying it: “It's a Gemini model.”
And that was that, given our alternatives. But it's got its charms.
Consider this a nice little, highly skippable break.
The Pitch
It's a good model, sir. That's the pitch.
Sundar Pichai (CEO Google): Gemini 3.1 Pro is here. Hitting 77.1% on ARC-AGI-2, it's a step forward in core reasoning (more than 2x 3 Pro).
With a more capable baseline, it's great for super complex tasks like visualizing difficult concepts, synthesizing data into a single view, or bringing creative projects to life.
We’re shipping 3.1 Pro across our consumer and developer products to bring this underlying leap in intelligence to your everyday applications right away.
Jeff Dean also highlighted ARC-AGI-2 along with some cool animations, an urban planning sim, some heat transfer analysis and the general benchmarks.
On Your Marks
Google presents a good standard set of [...]
---
Outline:
(00:37) The Pitch
(01:31) On Your Marks
(04:34) Other People's Benchmarks
(06:54) Gemini 3 DeepThink V2
(12:33) Positive Feedback
(17:22) Negative Feedback
(19:07) Try Gemini Lite
---
First published: March 4th, 2026
Source: https://www.lesswrong.com/posts/82zizPyyPgaEswbxz/gemini-3-1-pro-aces-benchmarks-i-suppose
---
Narrated by TYPE III AUDIO.