ChatGPT's Hallucinations Are Reportedly Getting Worse

Don't trust everything AI tells you.

If making mistakes and assumptions is the most human thing to do, then AI is certainly getting more advanced: ChatGPT and similar bots are now hallucinating more than ever, presenting false information as fact.

The New York Times reports that since the introduction of reasoning systems, AI chatbots from OpenAI, Google, and DeepSeek have improved their math skills, but their handling of facts has gotten worse.

OpenAI's own tests show that its latest models are less accurate than the older ones.

"The company found that o3 — its most powerful system — hallucinated 33 percent of the time when running its PersonQA benchmark test, which involves answering questions about public figures. That is more than twice the hallucination rate of OpenAI’s previous reasoning system, called o1. The new o4-mini hallucinated at an even higher rate: 48 percent."

In another test, SimpleQA, the hallucination rates for o3 and o4-mini were 51 percent and 79 percent, respectively, while o1 hallucinated 44 percent of the time.
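To put the SimpleQA figures in perspective, here is a minimal Python sketch that expresses each model's rate as a multiple of o1's. The percentages are the ones quoted above; the function name `relative_to_baseline` is purely illustrative and is not taken from OpenAI's report.

```python
# SimpleQA hallucination rates as quoted in the article (percent).
simpleqa_rates = {"o1": 44, "o3": 51, "o4-mini": 79}


def relative_to_baseline(rates, baseline="o1"):
    """Return each model's rate divided by the baseline model's rate."""
    base = rates[baseline]
    return {model: rate / base for model, rate in rates.items()}


ratios = relative_to_baseline(simpleqa_rates)
for model, ratio in ratios.items():
    print(f"{model}: {simpleqa_rates[model]}% ({ratio:.2f}x the o1 rate)")
```

On these numbers, o3 comes out at roughly 1.16 times the o1 rate and o4-mini at roughly 1.80 times.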

The problem is that researchers can't explain why AI behaves this way: the bots simply process too much information.

"We still don’t know how these models work exactly," said Hannaneh Hajishirzi, a professor at the University of Washington and a researcher with the Allen Institute for Artificial Intelligence.

Oddly, you'd expect that the more information an AI consumes, the better its responses would be, but as we can see, that isn't the case. Perhaps AI, like humans, suffers from too much choice.

Hallucinations are not a big deal if you use ChatGPT for fun or personal tasks, but relying on it for serious work without fact-checking can be quite dangerous. Remember when Google's AI Overviews suggested using glue to make cheese stick to pizza?

So, it seems that we are still far away from AI fully replacing humans. Ask ChatGPT for advice, but don't trust it too much.
