China's Deepseek is NOT as smart as…

Jan 29

Deepseek only matches OpenAI's FREE products in intelligence

10 Comments

Interesting post! I found DeepSeek to be pretty unimpressive - unempirical of course but my impression was that it seemed more like Chat 3.5 at best. It failed at some pretty basic stuff.

Tim Duffy

Jan 29

This is cool; those benchmark results from the private version are consistent with my subjective impression. I thought R1 lacked vision capability, though, so I'm a bit confused about how it's answering questions like #24. Do you intend to have R1 take your political compass test as well?

Maxim Lott

Jan 29

Thanks! The questions are described in words. And yes, it will be taking that quiz next.

Tim Duffy

Jan 29

Happy to hear that's next! I've been very curious about how reasoning models will fall on the political spectrum. I noticed that o1 is quite extreme in your testing but in the middle of the pack in David Rozado's testing, so I don't know what to think yet.

Suzanne M.

Jan 29

Chat has about a 30% fail rate on questions I ask, like ‘was there ever a grandma burger at A and W?’ It answers, and then I ask ‘are you sure?’ and it apologizes.

It has happened with recipes, novels and multiple other things.

Anh Hoang

Jan 31

Thanks a lot for the article. We also wrote an article about DeepSeek, China's AI ecosystem, and ChatGPT as they relate to NVDA stock, here:

🚨 AI just got 45x cheaper—DeepSeek built a GPT-4-level model for $5.6M, and if this scales, Nvidia’s AI monopoly might not last. 🚨

https://ghginvest.substack.com/p/ai-just-got-45x-cheaperand-it-might

_ikaruga_

Jan 31 (Edited)

"However, the Chinese government is one of the most authoritarian on Earth, and it’s plausible it will pressure China’s entrepreneurs to keep further AI developments more closely-held."

Whilst your government isn't, and it hasn't plausibly pressured its associated monopolists (or "entrepreneurs") into anything at all, nor is it plausible that it is going to do so in the coming times.

Here's to non-ideological reporting :).

Maxim Lott

Feb 1

It is plausible that our government would pressure companies to do the same. But the level of authoritarianism beyond that (democracy vs. not, and some other human rights things) is unfortunately very different, for now.

Aaron Melgar

Jan 30

Awesome! Check out my essay on pricing models based on IQ

https://blog.aaronamelgar.me/p/iqh-a-new-way-of-pricing-ai-systems

_ikaruga_

Jan 31

Well, the racket about DeepSeek had a beneficial effect for GPT users: one can hardly see it as a random event that GPT made o3-mini, with its Reason feature, available today.

And in fact, there is a wide *intelligence* gap between o3-mini and o4-mini and o4.

Maximum Truth