Your AI Will Be Wrong. Design for It.
2026-02-02 · 13 min read · Janaina Maia
Here's the conversation I have with every product team building AI features:
"What happens when the AI is wrong?"
Silence. Hand-waving. "It rarely gets it wrong." "Users can regenerate." "We'll add a feedback button."
None of these are design strategies. They're wishful thinking dressed up as product decisions.
Look, AI errors aren't bugs to be eliminated. They're a fundamental characteristic of probabilistic systems. Your AI will be wrong. Regularly. The question is whether you've designed for it or whether you're going to act surprised when it happens.
Not All AI Errors Are the Same
I classify them into five types because each needs a different UX response:
Type 1: Confident and Wrong
The most dangerous kind. The AI returns a definitive-seeming result, but it's incorrect, and because nothing signals doubt, users may not question it. This is the one that keeps me up at night.
Type 2: Uncertain and Communicating
The AI flags low confidence. This is actually good — the system is working as designed. The challenge is presenting uncertainty without undermining trust in the whole system.
Type 3: Hallucination
The AI generates plausible-sounding but fabricated information. Cites a paper that doesn't exist. References a data point that wasn't in the input. Classic LLM behavior.
Type 4: Scope Violation
The AI answers something it shouldn't have. A geological tool offering structural engineering advice, for example.
Type 5: Stale Context
The answer was correct when the model was trained but is wrong now. Regulations changed, methods were superseded, data drifted.
Patterns That Actually Handle Errors
The Verification Prompt
For high-stakes outputs, don't just display the result — prompt the user to verify specific aspects. Highlight the claims that should be checked. Provide links to source data. Ask: "Does this match your field observations?"
This is your defense against Type 1 errors.
Graceful Degradation
When the AI can't produce a reliable result, don't show an error. Show the best available alternative:
- Full automated result (happy path)
- Partial result with gaps highlighted
- Relevant data and tools but no conclusion
- "Insufficient data" + suggested next steps
Always have a next rung on the ladder. Never dead-end.
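Here's a minimal sketch of that ladder in Python. Everything in it is an assumption for illustration: the `ModelOutput` fields, the `choose_rung` function, and the threshold values are hypothetical, and your own signals for "reliable" will probably look different.

```python
from dataclasses import dataclass, field
from enum import Enum


class Rung(Enum):
    FULL_RESULT = "full automated result"
    PARTIAL_RESULT = "partial result with gaps highlighted"
    DATA_ONLY = "relevant data and tools, no conclusion"
    INSUFFICIENT_DATA = "insufficient data, with suggested next steps"


@dataclass
class ModelOutput:
    # Hypothetical fields; adapt to whatever your model actually reports.
    confidence: float  # overall confidence, 0.0 to 1.0
    missing_fields: list = field(default_factory=list)  # inputs the model lacked
    has_relevant_data: bool = True  # is there anything useful to show at all?


def choose_rung(output, full_threshold=0.8, partial_threshold=0.5):
    """Pick the highest rung the output can support. Thresholds are placeholders."""
    if output.confidence >= full_threshold and not output.missing_fields:
        return Rung.FULL_RESULT
    if output.confidence >= partial_threshold:
        return Rung.PARTIAL_RESULT
    if output.has_relevant_data:
        return Rung.DATA_ONLY
    # The bottom rung still gives the user somewhere to go next.
    return Rung.INSUFFICIENT_DATA
```

The point of the structure is that every branch returns something renderable. There is no code path that ends in a bare error.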
Source Anchoring
For every AI-generated claim, provide a clickable link to the source data. If there's no source, make that visually obvious.
- Sourced claims get subtle citation markers
- Unsourced claims get a distinct treatment — lighter weight, dashed underline, "AI-generated" tag
- Let users toggle between AI interpretation and raw source data
This is the single best defense against hallucination.
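One way to make this enforceable is to carry the source on the claim itself, so the renderer can't forget the distinction. A rough sketch, assuming a hypothetical `Claim` type and plain-text output in place of real UI styling:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    text: str
    source_url: Optional[str] = None  # None means there is no source to anchor to


def render_claims(claims):
    """Render sourced claims with a citation marker and flag unsourced ones."""
    rendered = []
    citation_number = 1
    for claim in claims:
        if claim.source_url:
            # Sourced: subtle numbered marker that links to the source data.
            rendered.append(f"{claim.text} [{citation_number}]({claim.source_url})")
            citation_number += 1
        else:
            # Unsourced: stand-in for the distinct visual treatment.
            rendered.append(f"{claim.text} (AI-generated, no source)")
    return rendered
```

In a real interface the unsourced branch would apply the lighter weight and dashed underline instead of a text suffix. What matters is that the two cases are separated in the data, not guessed at render time.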
The Undo Stack
Every AI action should be reversible. Not generic undo — contextual undo that restores the exact previous state.
The harder an AI error is to undo, the more it costs.
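The simplest version of contextual undo is a snapshot taken before each AI action. This is a sketch under assumptions: the `UndoStack` class is hypothetical, state is a plain dict, and deep-copying everything is only reasonable when the state is small.

```python
import copy


class UndoStack:
    """Snapshot-based undo: each AI action can be rolled back exactly."""

    def __init__(self):
        self._snapshots = []  # list of (label, state_before_action)

    def apply(self, state, label, action):
        """Save a deep copy of the state, then let the AI action mutate it."""
        self._snapshots.append((label, copy.deepcopy(state)))
        action(state)

    def undo(self, state):
        """Restore the state from before the most recent AI action.

        Returns the label of the undone action, or None if nothing to undo.
        """
        if not self._snapshots:
            return None
        label, previous = self._snapshots.pop()
        state.clear()
        state.update(previous)
        return label
```

Used like this:

```python
doc = {"title": "Site report", "tags": ["draft"]}
stack = UndoStack()
stack.apply(doc, "AI retitle", lambda s: s.update(title="Geological Summary"))
stack.apply(doc, "AI tagging", lambda s: s["tags"].append("sandstone"))
stack.undo(doc)  # undoes only the tagging; the new title stays
```

The label is what makes it contextual: the interface can say "Undo AI tagging" instead of a generic "Undo".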
Visual Language for AI Errors
AI errors need their own visual system, separate from system errors:
- System errors: Red. Something is broken.
- AI uncertainty: Amber. Output needs human judgment. (NOT red — functioning as designed.)
- AI limitation: Grey/blue. Outside capability.
- AI correction: Purple/green. Learning moment.
I see teams use red for AI uncertainty all the time. It makes users think the system is failing when it's actually doing exactly what it should.
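To keep red out of the uncertainty path, it helps to classify the state before anything picks a color. A sketch, with assumptions flagged: the `classify` inputs and the threshold are hypothetical, AI correction is left out because it is triggered by the user rather than by the response, and where the palette above offers two options (grey/blue, purple/green) I've arbitrarily picked the first.

```python
from enum import Enum


class Status(Enum):
    SYSTEM_ERROR = "system_error"
    AI_UNCERTAINTY = "ai_uncertainty"
    AI_LIMITATION = "ai_limitation"
    AI_CORRECTION = "ai_correction"


# One color per status; grey and purple are arbitrary picks from the palette.
STATUS_COLOR = {
    Status.SYSTEM_ERROR: "red",
    Status.AI_UNCERTAINTY: "amber",
    Status.AI_LIMITATION: "grey",
    Status.AI_CORRECTION: "purple",
}


def classify(request_failed, in_scope, confidence, threshold=0.7):
    """Map a response to a status, or None when no special treatment is needed."""
    if request_failed:
        return Status.SYSTEM_ERROR  # something is actually broken
    if not in_scope:
        return Status.AI_LIMITATION  # outside what the AI should answer
    if confidence < threshold:
        return Status.AI_UNCERTAINTY  # working as designed, needs human judgment
    return None
```

Because low confidence can only ever produce `AI_UNCERTAINTY`, there's no way for an uncertain answer to end up rendered in red.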
Flip Your Testing Ratio
Most teams only test AI features with good inputs. Flip that:
- 30% of testing time on happy paths
- 70% of testing time on error scenarios
Include ambiguous inputs, incomplete data, adversarial edge cases, out-of-distribution data, and temporal edge cases.
The AI products that earn lasting trust aren't the ones that never make mistakes. They're the ones that handle mistakes so well that users feel safe relying on them anyway. That's the bar. And most products aren't anywhere near it.