Most technical work on interpretability focuses on shrinking the black box: saliency maps, SHAP values, counterfactuals, rule extraction, and local surrogate models. These are useful tools — I've applied several of them across projects listed on my Publications page — but they don't always deliver the understanding clinicians actually need.
That gap is not just a usability problem; it's conceptual. Two central questions matter: What counts as an explanation? and Who is the explanation for? Answering them changes how we design, evaluate, and deploy XAI.
Technical explanations vs. human understanding
Engineering-oriented explanations typically quantify feature contributions or visualize focus areas. Those outputs are great for debugging and auditing models. Yet a clinician confronted with "Feature A contributed 0.23" needs a different kind of reasoning: a justification that supports an action for a specific patient.
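To make the contrast concrete, here is a minimal sketch (using a toy linear risk model and hypothetical feature names, not any deployed system) that renders the same attribution two ways: as raw per-feature numbers, and as a short clinician-facing statement about what drives the score and where the estimate is uncertain.

```python
# Minimal sketch: the same attribution rendered two ways.
# Assumes a toy linear risk model with hypothetical features; for a linear model,
# coef * (x - mean) is the (interventional) Shapley contribution of each feature.
import numpy as np

feature_names = ["age", "creatinine", "systolic_bp"]   # hypothetical features
coef = np.array([0.02, 0.45, 0.01])                    # hypothetical model weights
baseline = np.array([60.0, 1.0, 130.0])                # population means
patient = np.array([72.0, 1.8, 155.0])                 # one patient's values

contributions = coef * (patient - baseline)            # per-feature attribution

# Engineering view: numbers that are useful for debugging and auditing.
for name, c in zip(feature_names, contributions):
    print(f"{name}: {c:+.2f}")

# Clinician-facing view: a short, hedged rationale tied to this patient.
top = feature_names[int(np.argmax(np.abs(contributions)))]
print(
    f"The score is driven mainly by {top}; "
    "the estimate is less reliable for patients unlike the training cohort."
)
```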
Philosophy gives useful distinctions
Some philosophical ideas I find practically helpful:
- Epistemic opacity — systems can produce reliable results while remaining only partially intelligible to humans. Privacy-preserving pipelines (e.g., encrypted federated learning) exacerbate opacity: they make systems secure, but also harder to interrogate with standard XAI tools.
- Explanation vs. justification — an explanation describes how a model reached an output; a justification explains why acting on that output is rational for the decision-maker. Clinicians require both.
- Epistemic injustice — design choices can advantage some users as knowers and disadvantage others. Explanations that assume specific domain knowledge or language proficiency can entrench inequality.
Design implications for XAI
Philosophical clarity leads to different engineering priorities:
- Audience-specific explanations: produce actionable, high-level rationales for clinicians; produce technical fidelity reports for model auditors; produce concise, comprehensible narratives for patients.
- Actionability & calibration: evaluate whether explanations help users make better decisions or appropriately calibrate trust. This requires human-in-the-loop studies, not just proxy metrics.
- Privacy-aware interpretability: in privacy-preserving systems (like MedHE), standard XAI probes on raw gradients are unavailable. Design explanations that operate at encrypted or aggregated abstraction levels — cohort-level summaries, uncertainty bands, and example-based explanations that do not leak sensitive data (a minimal sketch follows this list).
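As one illustration of the cohort-level idea, here is a minimal sketch, assuming each site computes per-patient attributions locally (feature names and data are hypothetical). Only aggregate statistics — a mean and an uncertainty band per feature — leave the site; raw gradients and patient records never do.

```python
# Minimal sketch of a cohort-level explanation summary. Assumes each site
# already holds per-patient attributions computed locally; only aggregates
# are shared, never raw gradients or patient records.
import numpy as np

def cohort_summary(local_attributions: np.ndarray, feature_names: list[str]) -> dict:
    """local_attributions: shape (n_patients, n_features), computed inside the site."""
    mean = local_attributions.mean(axis=0)
    # A simple spread estimate serves as the uncertainty band;
    # a differential-privacy noise step could be added here before release.
    lo, hi = np.percentile(local_attributions, [10, 90], axis=0)
    return {
        name: {"mean": float(m), "band": (float(l), float(h))}
        for name, m, l, h in zip(feature_names, mean, lo, hi)
    }

# Example with synthetic data standing in for one site's local attributions.
rng = np.random.default_rng(0)
attributions = rng.normal(loc=[0.3, -0.1, 0.05], scale=0.2, size=(200, 3))
print(cohort_summary(attributions, ["creatinine", "age", "systolic_bp"]))
```

The design choice is that the explanation interface consumes only these released aggregates, so it works identically whether the underlying training used encrypted or plaintext gradients.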
Concrete links to my work
Privacy-preserving federated learning projects such as MedHE show that strong performance and patient privacy can coexist, but they also highlight the interpretability gaps that arise once gradients are encrypted or sparsified. Complementary projects in my publications explore fairness-aware representation learning and interpretable hybrid frameworks.
"A usable explanation for a clinician need not reveal training data; it must reveal the model's limits, typical failure modes, and the sources of uncertainty relevant to the current patient."
From metrics to meaning: evaluation suggestions
Beyond fidelity and stability, we should evaluate explanations by:
- Decision impact: does the explanation improve clinical decisions in simulation or practice? (A minimal simulation harness is sketched after this list.)
- Comprehension: can the target user paraphrase the rationale and uncertainty correctly?
- Equity of explanation: do explanations work across demographic, linguistic, and expertise differences?
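Here is a minimal sketch of the decision-impact idea, using simulated data and a hypothetical clinician-behaviour model rather than real study results: the simulated clinician follows the model unless the explanation surfaces low confidence, and we compare decision accuracy with and without explanations.

```python
# Minimal sketch of a decision-impact evaluation. All behavioural parameters
# are hypothetical placeholders for a real human-in-the-loop study.
import numpy as np

rng = np.random.default_rng(1)
n_cases = 1000
truth = rng.integers(0, 2, n_cases)                    # ground-truth outcomes
model_correct = rng.random(n_cases) < 0.85             # model is right 85% of the time
model_pred = np.where(model_correct, truth, 1 - truth)
model_conf = np.where(model_correct,
                      rng.uniform(0.7, 1.0, n_cases),  # confident when right
                      rng.uniform(0.4, 0.8, n_cases))  # shakier when wrong

def simulated_decisions(with_explanation: bool) -> np.ndarray:
    """Clinician follows the model unless an explanation reveals low confidence.

    Uses the module-level simulated case data above.
    """
    decisions = model_pred.copy()
    if with_explanation:
        flagged = model_conf < 0.6                     # explanation surfaces uncertainty
        # On flagged cases the clinician re-reviews and is right 75% of the time.
        override_correct = rng.random(n_cases) < 0.75
        decisions[flagged] = np.where(override_correct[flagged],
                                      truth[flagged], 1 - truth[flagged])
    return decisions

for condition in (False, True):
    acc = (simulated_decisions(condition) == truth).mean()
    print(f"explanation={condition}: decision accuracy = {acc:.3f}")
```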
Where I'm heading
My next experiments will focus on cohort-level explanation modules compatible with privacy-preserving pipelines, clinician-in-the-loop calibration studies, and integrating normative frameworks from epistemology to inform explanation content for sensitive domains. If you're interested in collaborating on human evaluation studies or privacy-aware interpretability, I'd love to connect.