Why Karl Friston is betting on cultivating curiosity for sustainable AGI

By George Lawton · March 26, 2026
Summary:
The ARC-AGI-3 challenge launches today as the first interactive reasoning benchmark that stumps current frontier LLMs. Karl Friston, one of the most cited neuroscientists alive and Chief Scientist at VERSES AI, argues that his active inference framework can compete where LLMs fail because it does something they structurally cannot: encode uncertainty and be curious. The questions Friston is asking may outlast the answers Big AI is selling.

Karl Friston

The ARC-AGI-3 challenge, which launches today, is the first interactive reasoning benchmark designed to expose what Large Language Models (LLMs) fundamentally cannot do: explore, form hypotheses, and learn from uncertainty. Current frontier AI models score zero on the preview. Humans solve the same puzzles in minutes. The gap tells you something important about what is missing from the dominant AI paradigm.

That gap is the territory Karl Friston, Chief Scientist at VERSES AI, a small Canadian cognitive computing company, has been working in for decades. Friston is one of the most cited neuroscientists alive, and his work on active inference, a mathematical framework for how brains make sense of the world by encoding and reducing uncertainty, has become the dominant paradigm in computational neuroscience. Now he and his team are building toward demonstrating that this framework can compete with approaches backed by billions of dollars in compute. Friston says of the ARC-AGI-3 challenge:

The ARC-AGI-3 challenge is designed so that you can't hack it with a large language model. So you can't hack it with current tech. This is not interesting anymore. This is what is interesting. It's basically reproducing true general intelligence, read as natural intelligence. The capabilities that speak to your creativity, speak to your curiosity, all the different hypotheses that we bring to the table. But to resolve our uncertainty about hypotheses, we again come back to this imperative to minimize uncertainty.

VERSES is a dark horse in this race. Its market capitalization has collapsed from several hundred million dollars to a fraction of that, the original founders have departed, and there have been recent layoffs. But VERSES has already demonstrated that its active inference engine, AXIOM, can outperform DeepMind's DreamerV3 by up to 60% on game benchmarks while using 97% less compute and learning 39 times faster. ARC-AGI-3 is a much bigger stretch, but it tests exactly the capabilities that active inference was designed for.

When the machine can't be curious

Friston's path to AI started in psychiatry. Working with patients with schizophrenia in the early 1990s, he noticed that their core deficit was not in processing information but in assigning the right level of confidence to different sources of evidence. Get the uncertainty wrong and you cannot do good inference: you start seeing things that are not there, or fail to see things that are. This insight, refined through decades of brain imaging research alongside colleagues including Geoffrey Hinton, eventually crystallized into the free energy principle and active inference.

The connection to current AI is more direct than it might appear. Friston observes that LLMs suffer from the same structural deficit:

So it looks as if most of psychiatric and neurological disorders can be explained by a failure to encode uncertainty, or its complement. In my world that's called precision, or confidence, the credence you give to certain kinds of evidence. And if that's right, what you are saying is that large language models, in their failure to encode uncertainty, are extremely prone to psychiatric disorders. And it's interesting that people like Gary Marcus sort of talk about hallucinations, and that's been taken up as a meme. From my point of view, that is a delightful convergence of the actual maths, computational psychiatry and the understanding of Bayes-optimal sense-making, because it is exactly the kind of false inference you get when you fail to do proper uncertainty quantification.
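
Friston's notion of precision has a concrete mathematical reading. Here is a minimal sketch (my illustration, not VERSES code) of precision-weighted Bayesian fusion: two Gaussian beliefs are combined in proportion to their precisions, and assigning too much precision to a noisy channel produces a confidently wrong posterior, the formal analogue of the false inference he describes.

```python
import numpy as np

def fuse(mu_prior, pi_prior, mu_evidence, pi_evidence):
    """Precision-weighted Bayesian fusion of two Gaussian beliefs.

    pi_* are precisions (inverse variances): the 'credence' given
    to each source of evidence, in Friston's terminology.
    """
    pi_post = pi_prior + pi_evidence
    mu_post = (pi_prior * mu_prior + pi_evidence * mu_evidence) / pi_post
    return mu_post, pi_post

# A broad prior belief and a noisy observation of some latent quantity.
mu_prior, pi_prior = 0.0, 1.0
mu_obs = 4.0

# Well-calibrated: the sensory channel really is noisy (low precision).
print(fuse(mu_prior, pi_prior, mu_obs, pi_evidence=0.5))
# -> posterior mean ~1.33, still appropriately uncertain

# Miscalibrated: the same noisy channel is assigned very high precision.
print(fuse(mu_prior, pi_prior, mu_obs, pi_evidence=50.0))
# -> confident posterior near 3.92: 'seeing things that are not there'
```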

This is not a philosophical objection. It is a mechanical one. Without an encoding of uncertainty, a system cannot work out what it does not know, which means it cannot figure out what information would be most valuable to seek. Friston frames the consequences of this:

In order to know how to forage epistemically, how to ask the right questions, you have to encode your uncertainty. So if you're a large language model, you have no notion of uncertainty. You don't know what's going to resolve it. Which means that the large language model can never ask you a question. It can never be curious. So at that point, you put this Bayesian mechanics, adaptive inference, into large language models, they'll start prompting you, because they'll be interested in resolving uncertainty about you. And at that point, you know, I think there'll be interesting questions about machine consciousness and true machine psychosis, where the machine could actually become psychotic beyond just having hallucinations.

The rub is that LLMs can, after a fashion, ask questions today: developers build a facsimile of curiosity into them by prompting the model to check whether its answer was sufficient and to pose follow-up questions. But this does not mean the model is truly reckoning with uncertainty, even if the technique is useful in specific contexts. The distinction matters: a system that simulates curiosity through prompts and a system that is genuinely driven to reduce its own uncertainty are doing fundamentally different things, even if they sometimes look the same to a casual observer.
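
To make that distinction concrete, here is a hedged sketch of what being genuinely driven to reduce uncertainty means operationally: the agent scores candidate questions by expected information gain, the expected reduction in the entropy of its beliefs, and asks the one that resolves the most uncertainty. The scenario and numbers below are illustrative, not AXIOM's implementation.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_information_gain(belief, likelihood):
    """likelihood[a, h] = P(answer a | hypothesis h) for one question.

    EIG = H(belief) - E_over_answers[H(posterior)]: how much asking
    this question is expected to shrink the agent's uncertainty.
    """
    p_answer = likelihood @ belief              # marginal P(answer)
    posterior = likelihood * belief             # unnormalised P(h | a)
    posterior /= p_answer[:, None]
    expected_posterior_entropy = np.sum(
        p_answer * np.array([entropy(row) for row in posterior]))
    return entropy(belief) - expected_posterior_entropy

belief = np.array([0.5, 0.3, 0.2])   # belief over three hypotheses

# Question A cleanly separates hypothesis 0 from the others;
# question B is nearly uninformative.
q_a = np.array([[0.9, 0.1, 0.1],
                [0.1, 0.9, 0.9]])
q_b = np.array([[0.5, 0.5, 0.6],
                [0.5, 0.5, 0.4]])

for name, q in [("A", q_a), ("B", q_b)]:
    print(name, expected_information_gain(belief, q))
# A curious agent asks A: it carries the higher expected information gain.
```

The same arithmetic underlies the experiment-design point Friston makes below: the right question, or the right experiment, is the one expected to resolve the most uncertainty, not the one that generates the most data.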

Curiosity on a budget

The prevailing AI investment thesis is that progress requires ever-larger data centers, ever-more compute, and ever-bigger models. Friston argues that this is not just wasteful but unnecessary:

No scientist goes out and trawls data. They carefully design an experiment that generates exactly the right kind of data to resolve their uncertainty about their hypothesis. That's one expression of just being curious. So we design experiments to generate the right data. Not the big data. So it's a complete waste of time, from my perspective, investing in these data warehouses and all this infrastructure.

So why is a scientist of Friston's stature at a struggling startup rather than at a lab with billions behind it? The answer is simpler and more human than you might expect:

The reason I'm committed, and was committed, to VERSES on my arrival, was simply because all my favorite students got jobs there. It is exactly that, being able to work with a small group of like-minded, inquisitive young people who want to explore the best way forward and be part of that endeavor.

Sitting with Friston, I did not get the impression that he feels he is missing out on the opportunities of a much larger AI company. He works from home most days, chewing on whatever questions happen to provoke his curiosity and collaborating with hundreds of researchers around the world. There is a patience and childlike wonder about him that you rarely find in the AI space.

That said, VERSES has had real business challenges. The company has gone through a leadership transition with the recent appointment of CEO David Scott, a former AWS marketing and operations executive. It has narrowed its focus from a broad spatial web vision to a single near-term use case: improving financial models for bond traders. Revenue went from zero to $400,000 last quarter. The near-term opportunity lies not in diving straight into the grand vision but in improving simpler models, like the bond trading work, without waiting for broader infrastructure adoption.

The intellectual frontier Friston is most excited about extends beyond any single product. His team has been working on what happens when you put active inference agents in communication with each other, which he calls belief sharing. The key insight is that sharing beliefs between systems requires sharing uncertainty, not just conclusions. He says:

Certainly in my world the key questions now are about what happens when you put two active inference agents together. So we're now talking about a society of mind or ecosystems of intelligence. The interesting thing is that it all depends upon belief sharing. And I'm using belief here not in some vernacular sense. I'm talking about a Bayesian belief, a conditional probability distribution that comes equipped with an uncertainty.

This is early-stage research, but it points toward how humans and AI systems might communicate about uncertainty in ways that are genuinely useful in medicine, finance, manufacturing, and anywhere that edge cases matter more than averages.
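
What makes belief sharing different from ordinary message passing is that the message carries its own uncertainty. A minimal sketch, with illustrative numbers rather than anything from VERSES's research: an agent that receives a Bayesian belief (here a Gaussian, a mean plus a precision) knows exactly how much weight to give it, whereas a bare conclusion leaves the receiver guessing.

```python
import numpy as np

def receive(own_mu, own_pi, msg_mu, msg_pi):
    """Combine an agent's own Gaussian belief with a shared belief.

    Because the message carries a precision, the receiver knows how
    much weight to give it - a Bayesian belief, not a bare verdict.
    """
    pi = own_pi + msg_pi
    mu = (own_pi * own_mu + msg_pi * msg_mu) / pi
    return mu, pi

# Agent A holds a vague belief; agent B a confident one.
a = (2.0, 0.2)   # (mean, precision): A is unsure
b = (5.0, 8.0)   # B has strong evidence

# Belief sharing: A absorbs B's estimate in proportion to B's confidence.
print(receive(*a, *b))   # mean ~4.93, rightly dominated by B

# Sharing only the conclusion ('the answer is 5') strips the uncertainty:
# the receiver must invent a weight, and there is no principled update.
```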

My take

To understand Karl Friston, you have to step into a different frame of mind than the one Big AI is operating from: the big data centers, the big compute, the big domination mindset. Friston is inviting us on a different journey. The AI he is talking about is something that is curious, something that is patient, something that actually needs us.

I have been sitting with the tensions and uncertainties in this story, trying to work out why active inference is so hard to explain. It is not just the vocabulary, where words like "precision" and "confidence" mean something close to the opposite of what you would expect in everyday English. It is that the whole paradigm requires a willingness to sit with uncertainty rather than eliminate it. People want the answer. Karl is saying the question is the answer, or at least that asking the right question is the most important thing you can do.

There is a very real possibility that what Friston is doing will not succeed commercially. A lot of money is betting that Big AI will need big data centers and big energy; play that forward, and the picture does not look good. Karl is calling into being a future that actually needs us, one that needs curiosity.
