Every Single AI on the Planet Just Failed a Test That 100% of Humans Passed on Their First Try
A brutal new benchmark called ARC-AGI-3 just exposed the biggest gap between human and machine intelligence we've seen in years.
If you thought AI was getting close to matching human brains, think again. A brand new test called ARC-AGI-3 just dropped, and the results are honestly embarrassing for every major AI company on Earth.
Here's the deal: researchers created a set of puzzles that require basic reasoning, the kind of thing where you look at a pattern and figure out what comes next. Every single human who tried these puzzles solved them on their first attempt. A perfect 100% success rate.
Now here's where it gets wild. Google's best model, Gemini 3.1 Pro, scored just 0.37%, and that was the highest score of any AI tested. Elon Musk's Grok 4.2? It scored a flat zero. Not 0.1%. Not 0.01%. Literally zero percent.
The test comes with a $2 million prize for anyone who can build an AI that actually passes it. So far, that money is looking very safe.
What does this tell us? Today's AI is incredible at pattern matching and generating text, but when it comes to the kind of flexible, creative reasoning that even a child does naturally, these systems are still miles behind. It's a humbling reminder that chatbots that sound smart and systems that actually think are two very different things.
As reported by the ARC Prize Foundation and The Neuron.
Source: ARC Prize Foundation