Most technical work on interpretability focuses on shrinking the black box: saliency maps, SHAP values, counterfactuals, rule extraction, and local surrogate models. These are useful tools — I've applied several of them across projects listed on my Publications page — but they don't always deliver the understanding clinicians actually need.

That gap is not just a usability problem; it's conceptual. Two central questions matter: What counts as an explanation? and Who is the explanation for? Answering them changes how we design, evaluate, and deploy XAI.

Technical explanations vs. human understanding

Engineering-oriented explanations typically quantify feature contributions or visualize the input regions a model attended to. Those outputs are great for debugging and auditing models. Yet a clinician confronted with "Feature A contributed 0.23" needs a different kind of reasoning: a justification that supports an action for a specific patient.

In short: technical explanations surface correlations; explanations in the philosophical sense provide justifications that can support decisions.
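
To make the contrast concrete, here is a minimal sketch of the same attribution vector rendered two ways: as a raw technical dump and as a justification-oriented message. The feature names, values, and uncertainty threshold are hypothetical, not output from any of my projects.

```python
# Hypothetical contributions to a risk score; names and numbers are made up.
contributions = {
    "creatinine": +0.23,
    "age": +0.11,
    "systolic_bp": -0.08,
}
model_uncertainty = 0.18  # e.g., the width of a calibrated risk interval

# 1) The usual technical rendering: faithful, but not actionable on its own.
technical_view = ", ".join(f"{name} contributed {value:+.2f}"
                           for name, value in contributions.items())
print("Technical:", technical_view)

# 2) A justification-oriented rendering: what drove the score, and when the
#    clinician should discount it.
risk_drivers = [name for name, value in
                sorted(contributions.items(), key=lambda kv: -kv[1]) if value > 0]
caveat = ("uncertainty is high, so treat this as a prompt for review rather than a verdict"
          if model_uncertainty > 0.15
          else "uncertainty is within the usual range for this model")
print(f"For the clinician: elevated risk driven mainly by "
      f"{', '.join(risk_drivers)}; {caveat}.")
```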

Philosophy gives useful distinctions

Some philosophical ideas I find practically helpful:

  1. Explanation vs. justification: a feature-attribution output describes how a prediction was produced, while a justification gives a clinician reasons to act on it for a particular patient.
  2. Audience-relativity: an explanation is always an explanation for someone, so what counts as a good one depends on whether the reader is a clinician, an auditor, or a patient.
  3. Understanding vs. information: more detail is not more understanding; what matters is a grasp of the model's limits, typical failure modes, and the sources of uncertainty relevant to the case at hand.

Design implications for XAI

Philosophical clarity leads to different engineering priorities:

  1. Audience-specific explanations: produce actionable, high-level rationales for clinicians; produce technical fidelity reports for model auditors; produce concise, comprehensible narratives for patients.
  2. Actionability & calibration: evaluate whether explanations help users make better decisions or appropriately calibrate trust. This requires human-in-the-loop studies, not just proxy metrics.
  3. Privacy-aware interpretability: in privacy-preserving systems (like MedHE), standard XAI probes on raw gradients are unavailable. Design explanations that operate at encrypted or aggregated abstraction levels, such as cohort-level summaries, uncertainty bands, and example-based explanations that do not leak sensitive data; a minimal sketch follows this list.
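
Here is the promised sketch for item 3: cohort-level attribution summaries with uncertainty bands, released only when the cohort is large enough that no single record dominates. The feature names, the size threshold, and the synthetic data are assumptions for illustration; this is not the MedHE pipeline.

```python
import numpy as np

MIN_COHORT_SIZE = 20  # illustrative release threshold, not a recommendation
rng = np.random.default_rng(0)

feature_names = ["creatinine", "age", "systolic_bp", "hba1c"]
# Per-patient attributions as a site might compute them locally (synthetic).
local_attributions = rng.normal(loc=[0.20, 0.10, -0.05, 0.15],
                                scale=0.08, size=(64, 4))

def cohort_summary(attributions, names, min_n=MIN_COHORT_SIZE):
    """Per-feature mean attribution with a ~95% band, or None if the cohort
    is too small to summarize without exposing individual records."""
    n = attributions.shape[0]
    if n < min_n:
        return None  # refuse to release a summary for tiny cohorts
    mean = attributions.mean(axis=0)
    band = 1.96 * attributions.std(axis=0, ddof=1) / np.sqrt(n)
    return {name: (m, m - b, m + b) for name, m, b in zip(names, mean, band)}

summary = cohort_summary(local_attributions, feature_names)
for name, (center, low, high) in summary.items():
    print(f"{name:12s} mean {center:+.3f}  band [{low:+.3f}, {high:+.3f}]")
```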

Concrete links to my work

Privacy-preserving federated learning projects such as MedHE demonstrate that we can retain high performance while preserving privacy — but they also highlight interpretability gaps that arise when gradients are encrypted or sparsified. Complementary projects in my publications explore fairness-aware representation learning and interpretable hybrid frameworks.
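
As a small illustration of that gap (synthetic numbers, not MedHE internals), consider what happens to a gradient-based attribution when a client keeps only its top-k gradient entries for communication efficiency: most of the per-feature signal simply disappears.

```python
import numpy as np

rng = np.random.default_rng(2)
dense_grad = rng.normal(size=50)   # stand-in for a per-feature gradient

k = 5
top_k = np.argsort(np.abs(dense_grad))[-k:]   # indices a client would keep
sparse_grad = np.zeros_like(dense_grad)
sparse_grad[top_k] = dense_grad[top_k]

retained = np.count_nonzero(sparse_grad)
magnitude_corr = np.corrcoef(np.abs(dense_grad), np.abs(sparse_grad))[0, 1]
print(f"features with any attribution signal after top-{k} sparsification: {retained}/50")
print(f"correlation between |dense| and |sparse| attribution magnitudes: {magnitude_corr:.2f}")
```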

"A usable explanation for a clinician need not reveal training data; it must reveal the model's limits, typical failure modes, and the sources of uncertainty relevant to the current patient."

From metrics to meaning: evaluation suggestions

Beyond fidelity and stability, we should evaluate explanations by:

  1. Decision quality: does the explanation help the intended user choose a better action for the patient in front of them?
  2. Trust calibration: does it lead users to rely on the model when it is likely right and to discount it when it is likely wrong?
  3. Audience comprehension: can clinicians, auditors, and patients each extract what they need without specialist training?
  4. Privacy compatibility: does the explanation stay meaningful when only cohort-level or aggregated information can be released?
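
One way to operationalize the first two criteria is sketched below, using synthetic placeholder logs rather than data from any real study: score an explanation method by the decision quality it supports and by a simple confidence-versus-correctness calibration gap.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases = 200

# Synthetic stand-ins for a clinician-in-the-loop study log.
ground_truth = rng.integers(0, 2, size=n_cases)                     # correct action per case
decisions = np.where(rng.random(n_cases) < 0.85,                    # decision made with the
                     ground_truth, 1 - ground_truth)                # explanation available
confidence = np.clip(rng.normal(0.8, 0.1, size=n_cases), 0.0, 1.0)  # self-reported trust

correct = (decisions == ground_truth).astype(float)

# (a) Decision quality: how often the explanation-supported decision was right.
decision_quality = correct.mean()

# (b) Trust calibration: mean gap between stated confidence and actual
#     correctness within confidence buckets (a simple ECE-style measure).
bins = np.linspace(0.0, 1.0, 6)
idx = np.clip(np.digitize(confidence, bins) - 1, 0, len(bins) - 2)
gaps = [abs(confidence[idx == b].mean() - correct[idx == b].mean())
        for b in range(len(bins) - 1) if np.any(idx == b)]
trust_calibration_error = float(np.mean(gaps))

print(f"decision quality:        {decision_quality:.2f}")
print(f"trust calibration error: {trust_calibration_error:.3f}")
```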

Where I'm heading

My next experiments will focus on cohort-level explanation modules compatible with privacy-preserving pipelines, on clinician-in-the-loop calibration studies, and on using normative frameworks from epistemology to inform explanation content in sensitive domains. If you're interested in collaborating on human evaluation studies or privacy-aware interpretability, I'd love to connect.