When a clinician asks "why does the model think this patient has sepsis?", the answer can be the difference between acceptance and rejection of an AI tool — or even between life and death. Explainability is not a feature. It is a prerequisite.
The Trust Problem in Medical AI
Despite extraordinary progress in predictive accuracy, AI adoption in clinical settings remains sluggish. Studies consistently show that physicians distrust black-box models, even when those models outperform human judgment on benchmark datasets. The reason is not ignorance — it is rational skepticism. A clinician who cannot understand why a model reached its conclusion cannot assess whether the model's reasoning is clinically sound, whether it generalises to their specific patient, or whether it has been fooled by a confounding artefact in the data.
This is particularly acute in healthcare, where mistakes are costly, patients are heterogeneous, and the model's training distribution may differ significantly from the deployment context. Trust requires transparency.
"Clinicians don't need the model to be infallible. They need to understand it well enough to know when not to trust it."
— Enea Parimbelli, CEO of Bilobe
What Makes AI Explainable?
Explainability is not a single technique — it is a family of methods with different trade-offs. At Bilobe we work primarily with three approaches:
- Post-hoc feature attribution (SHAP, LIME): methods that approximate the model locally and assign importance scores to input features for a given prediction.
- Attention-based explanations: for transformer models processing clinical text, attention weights provide a window into which words or tokens influenced the output.
- Concept-based explanations (TCAV, CBM): methods that explain model behaviour in terms of human-interpretable clinical concepts rather than raw features.
Each approach has its place. Feature attribution works well for tabular data such as lab results and vital signs. Attention explanations are naturally suited to NLP tasks like discharge letter analysis. Concept-based methods shine when you need to communicate with domain experts in their own vocabulary.
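For linear models, SHAP values have a closed form: the coefficient times the feature's deviation from its background mean. The sketch below uses that identity on synthetic stand-ins for lab values; the feature names, data, and model are illustrative, not clinical, and not drawn from our systems.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for scaled lab results / vital signs (illustrative only).
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def linear_attributions(model, X_background, x):
    """For a linear model, the SHAP value of each feature reduces to
    coef * (x - E[x]) under feature independence."""
    return model.coef_[0] * (x - X_background.mean(axis=0))

phi = linear_attributions(model, X, X[0])
print(dict(zip(["lactate", "wbc", "heart_rate"], phi.round(3))))
```

The attributions sum exactly to the difference between this patient's log-odds and the average log-odds over the background data, which is what lets clinicians read the scores as additive contributions to a single prediction.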
Attention Maps in Diagnostic Models
Much of our applied XAI work centres on clinical natural language processing. When a model reads a discharge letter to extract diagnoses, medications, or risk indicators, attention maps let us visualise which phrases the model found most relevant. This turns an opaque vector operation into something a clinician can scan in seconds.
In our implementation for ICS Maugeri, we visualise these explanations directly in the clinical information extraction interface. Physicians can hover over any extracted entity to see the evidence trail: which sentences contributed to the extraction, with what confidence, and which tokens carried the most weight.
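Stripped of the transformer machinery, the ranking behind such a highlight view is just scaled dot-product attention over token keys. The toy below uses orthogonal keys and a hand-built query, both hypothetical stand-ins for what a real model derives from learned projections of contextual embeddings, to show how weights become a ranked evidence trail:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_highlights(query, keys, tokens):
    """Rank tokens by scaled dot-product attention weight for one query."""
    weights = softmax(query @ keys.T / np.sqrt(keys.shape[-1]))
    order = np.argsort(weights)[::-1]
    return [(tokens[i], round(float(weights[i]), 3)) for i in order]

# Toy setup: one orthogonal key direction per token (hypothetical values).
tokens = ["patient", "denies", "chest", "pain", "."]
keys = np.eye(len(tokens), 8)
query = 2.0 * keys[3]          # a query aligned with the "pain" key

highlights = attention_highlights(query, keys, tokens)
print(highlights[0])           # "pain" carries the most weight
```

In the interface, the same ranked list drives the hover view: the top-weighted tokens are highlighted, with their weights shown as the confidence of the evidence.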
Our Experience at ICS Maugeri
ICS Maugeri is one of Italy's leading rehabilitation hospital networks, treating thousands of patients across cardiology, pneumology, and neurology. Our collaboration focuses on automated extraction of clinical information from Italian-language discharge letters — a notoriously noisy, abbreviated, and domain-specific document type.
When we first deployed a BERT-based extraction model without explanations, clinical acceptance was low. Physicians described the system as a "black oracle" — they could see the outputs but had no way to evaluate their trustworthiness. Errors, when they occurred, felt arbitrary and unpredictable.
After integrating attention-based explanations, the picture changed markedly. Clinicians began to identify systematic error patterns — the model's weakness for negated findings, for instance — and started applying a mental discount when the model highlighted only single words rather than full diagnostic phrases. The tool became something they could reason about.
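The clinicians' "single-word highlight" discount can be written down as a simple concentration check: flag any explanation whose attention mass peaks on one token. This heuristic and its threshold are our illustration, not the rule deployed at Maugeri:

```python
def single_token_flag(weights, threshold=0.5):
    """Flag an explanation when one token carries most of the attention
    mass (the threshold is a hypothetical tuning knob)."""
    peak = max(weights) / sum(weights)
    return peak >= threshold

# Weight spread across a full phrase ("acute", "heart", "failure"): keep.
print(single_token_flag([0.05, 0.3, 0.3, 0.3, 0.05]))   # False
# Almost all weight on a single word: apply the discount.
print(single_token_flag([0.02, 0.9, 0.03, 0.05]))       # True
```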
Key Takeaways
- Explainability increases clinical adoption by making model errors predictable and understandable
- In our experience, attention-based explanations were the most actionable XAI technique for clinical NLP tasks
- Concept-based explanations are most effective for communicating with specialist clinicians
- XAI is not optional for high-risk AI under the EU AI Act — transparency requirements are legally binding
- Iterating on explanations with clinical users is as important as iterating on model accuracy
Lessons Learned and Next Steps
Three years of building and deploying XAI systems in clinical environments have taught us several hard lessons. First, explanations must be calibrated to the audience: what satisfies an informaticist is not what satisfies a cardiologist. Second, explanation fidelity matters — a beautiful but unfaithful explanation is worse than no explanation at all, because it actively misleads. Third, explanations must be fast: if retrieving an explanation adds two seconds to a query, clinicians will disable it.
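Fidelity can be tested rather than assumed. A standard deletion check removes the top-attributed feature and measures how much the model's output moves; faithful attributions should move it more than misranked ones. The sketch below runs the check in logit space on synthetic data (all names and values are illustrative, and the linear model is chosen so the attributions are exact):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = (X @ np.array([2.0, 1.0, 0.0, 0.0])
     + rng.normal(scale=0.5, size=400) > 0).astype(int)
model = LogisticRegression().fit(X, y)

def deletion_drop(model, x, attributions, baseline):
    """Replace the top-attributed feature with its baseline value and
    return the resulting change in the model's logit."""
    top = int(np.argmax(np.abs(attributions)))
    x_del = x.copy()
    x_del[top] = baseline[top]
    logits = model.decision_function(np.vstack([x, x_del]))
    return float(logits[0] - logits[1])

x, baseline = X[0], X.mean(axis=0)
faithful = model.coef_[0] * (x - baseline)   # exact attribution for a linear model
scrambled = faithful[::-1]                   # deliberately misranked scores
print(abs(deletion_drop(model, x, faithful, baseline)),
      abs(deletion_drop(model, x, scrambled, baseline)))
```

The faithful attribution always wins this comparison here because, for a linear model, deleting the top-ranked feature shifts the logit by exactly that feature's attribution, while the scrambled version deletes the wrong feature.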
Our roadmap for 2025 includes work on causal explanations — moving beyond correlation-based attribution to models that can reason about intervention effects. We are also exploring multimodal explanations that combine text highlights with structured summary tables, and integrating explanation quality metrics directly into our model evaluation pipelines.
The goal, ultimately, is not to make AI more explainable for its own sake. It is to build clinical AI systems that practitioners can use confidently, challenge appropriately, and improve collaboratively. Explainability is the foundation of that partnership.