Why researchers are creating humble AI

If you haven’t already, subscribe to join our community and receive weekly AI insights, updates and interviews with industry experts straight to your feed.


DeepDive 

Your weekly immersion in AI 

We’ve taught machines to give us answers. But we haven’t given those machines much room for uncertainty – no way to acknowledge that their answer could be wrong, or irrelevant, or just one of many possible answers. 

Now, researchers at MIT are trying to teach AI systems to recognise their own potential to make mistakes.  

Teaching AI to pause 

A newly proposed framework from MIT explores what its authors call ‘humble AI’ – systems designed not just to give answers, but to recognise when those answers might be wrong.

The focus is healthcare, where the stakes are particularly high. AI models are increasingly used to support clinical decisions – from diagnosing conditions to recommending treatments. But there’s a problem: they can sound convincingly certain, even when they’re incorrect.

And that confidence is persuasive.

Studies cited by the MIT team show that clinicians may defer to AI recommendations they perceive as authoritative – even when their own judgement disagrees.

So the risk here is misplaced trust. 

AI isn’t an oracle 

The MIT researchers propose a shift in how we design these systems.

Instead of treating AI as an oracle (a source of definitive answers), they suggest building systems that act more like collaborators. Ones that:

  • Evaluate their own level of certainty
  • Flag when evidence is incomplete
  • Ask for additional information
  • Recommend escalation to human experts when needed

As MIT researcher Leo Anthony Celi puts it:

AI should function as a ‘coach’ or ‘co-pilot’, not a final decision-maker.

But how do you engineer humility? 

The MIT team has developed a framework that can be layered onto existing models. At the heart of it is the idea that AI should continuously assess how confident it is and whether that confidence is justified.

One component, described as an ‘Epistemic Virtue Score,’ acts as a kind of internal check. It makes sure that the system’s confidence reflects the complexity and uncertainty of the situation.

If the model detects a mismatch (high confidence, but weak evidence) it can pause, flag the issue, and request more data before proceeding.

That might mean:

  • Asking a clinician for missing patient history
  • Suggesting further tests
  • Recommending a second opinion

In this way, the system’s goal shifts from answering as quickly as possible to guiding better decisions. 
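The mismatch-then-pause logic described above can be sketched in a few lines. This is a minimal illustration, not the MIT framework itself: the thresholds, the `evidence_score` input, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Assessment:
    confidence: float      # model's self-reported confidence, 0..1
    evidence_score: float  # how complete/strong the supporting evidence is, 0..1

def humble_check(a: Assessment,
                 conf_threshold: float = 0.85,
                 evidence_threshold: float = 0.5) -> str:
    """Flag a mismatch between high confidence and weak evidence.

    Illustrative only: thresholds and scoring are hypothetical.
    """
    if a.confidence >= conf_threshold and a.evidence_score < evidence_threshold:
        # High confidence on thin evidence: pause and ask for more data
        return "pause: request missing patient history or further tests"
    if a.confidence < conf_threshold:
        # Genuinely uncertain: escalate to a human expert
        return "escalate: recommend a second opinion"
    return "proceed: confidence is justified by the evidence"

# A confident answer built on weak evidence gets flagged, not delivered
print(humble_check(Assessment(confidence=0.95, evidence_score=0.3)))
# → pause: request missing patient history or further tests
```

The point of the sketch is the control flow: the system’s default is no longer “answer”, but “answer only when confidence and evidence agree”.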

It’s useful beyond healthcare 

While the research is currently grounded in medicine, the implications are much broader.

AI systems are rapidly moving into domains where uncertainty is the norm: 

  • Legal reasoning
  • Scientific discovery
  • Financial decision-making
  • Public policy

And in these environments, overconfidence is a systemic risk. If a system sounds certain, we tend to believe it – even when we shouldn’t.

‘Humble AI’ changes that dynamic by allowing us to see the system’s uncertainty, and make our own decisions about what to trust. 

A different kind of intelligence

This marks a turning point in AI development: we’re starting to measure progress by more nuanced qualities than speed and delivery. This research is part of a new focus on calibration – how well a system’s confidence matches reality. 

Because a system that’s 90% accurate but always sounds 100% certain can be more dangerous than one that is slightly less accurate, but knows when to hesitate.
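That 90%-accurate-but-100%-certain scenario can be made concrete with a toy calibration measure. The numbers below are made up for illustration, and the “gap” here is simply the difference between stated confidence and actual accuracy – a simplification of how calibration is measured in practice.

```python
def calibration_gap(stated_confidence: float, actual_accuracy: float) -> float:
    """Toy calibration measure: how far stated confidence is from reality."""
    return round(abs(stated_confidence - actual_accuracy), 2)

# Overconfident system: right 90% of the time, but always sounds certain
print(calibration_gap(1.00, 0.90))  # → 0.1 (trust is misplaced 10% of the time)

# Humble system: slightly less accurate, but its confidence matches reality
print(calibration_gap(0.85, 0.85))  # → 0.0 (well calibrated)
```

On this measure, the less accurate system is the safer one: its users know exactly how much to trust it.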

That hesitation creates space for human judgement. It invites collaboration rather than compliance. And it reframes AI from a tool that replaces thinking to one that supports it.

What to watch next

The MIT team is now working to implement this framework in real clinical systems, including tools built on large-scale medical datasets.

We’re curious to see whether this design philosophy will spread. 

Will future AI systems:

  • Show confidence scores as standard?
  • Ask questions before answering?
  • Or even refuse to respond when uncertainty is too high?

And will we, as users, learn to value that restraint?

Tell us what you think 

Would you trust an AI more if it openly admitted uncertainty – or would that make it feel less capable?

Open this newsletter on LinkedIn and tell us in the comments.

Join us at DeepFest from 31 August – 3 September 2026 to hear directly from the people creating the future of tech. 
