13 Comments
Greg:

The God Machine is on its way

Devadatta:

Better not overfeed it with orichalcum. Remember Plato's tenfold error!

Nick Q.:

Really appreciate this kind of longitudinal benchmarking—especially when the same test is repeated over time. That consistency is rare and valuable.

That said, it’s increasingly important to ask what we’re actually measuring as models start outperforming humans on IQ tests.

We’re still rewarding fluency under constraint (token-level pattern extension), not cognitive flexibility or causal reasoning. Rising scores may reflect emergent capabilities, but they also reveal growing prompt sensitivity and structural scaffolding around task framing.

I’ve been exploring a modular prompt framework (Lexome) to help separate prompt design from true model capability. Benchmarks like these are great, but we also need tests that hold structure constant to see what’s really improving.

Would love to hear from others thinking about uncertainty-aware evaluation or cognitive framing.

Martin Greenwald, M.D.:

There's a medical reasoning case I came up with that I've been giving to successive GPTs (and Claude and Gemini once in a while) to see how their reasoning abilities develop. o3 is clearly a big deal and blew the others out of the water. We're in for some interesting times.

Bob Rodrigues:

I enjoy your site. One comment on its design: change the color of the text now displayed in light yellow, which makes visualization difficult on most devices.

Maxim Lott:

Thanks. Are you referring to Maximum Truth, or TrackingAI.org, or both?

Russell Huang:

Hi Maxim, another question: have you (further) updated your personal p(doom) recently?

Auspicious:

Awesome article as usual. I'm always looking forward to your updates on the IQs of AI models, and I can't wait for your next post about AI vision progress. I think it's only a matter of time before tech like self-driving cars and smart glasses really takes off. The fundamentals are already here.

Tom (Apr 20, edited):

I might argue with your conclusion. Are you still submitting the IQ tests as text descriptions? I would be interested in your take on this article:

https://adamkarvonen.github.io/machine_learning/2025/04/13/llm-manufacturing-eval.html

All his general points and detailed analysis seem reasonable to me. (One of my hobbies is machining small parts.) Like the author, a large part of my intelligence is spatial reasoning and visualization, not serial processing (verbal, coding, music, etc.).

I grant that AI is hugely successful with 1-D problems. But I think it’s infantile in the 3-D world.

Nevermind:

So what now then?

Isaac King:

> AIs don’t feel things

Citation needed

Maxim Lott:

We know where human feelings come from (e.g., we know the sensations that dopamine leads to), and we know that we didn't program that into LLMs. Feelings are specific artifacts of biological evolution and needs.

mrmr:

LLMs don't have feelings in the same way humans do, but does that rule out LLMs having evolved their own analogue of feelings under training pressure? I don't think we can rule that out.
