[2504.10833] Measuring the (Un)Faithfulness of Concept-Based Explanations

arXiv - Machine Learning

Computer Science > Machine Learning
arXiv:2504.10833 (cs)
[Submitted on 15 Apr 2025 (v1), last revised 27 Mar 2026 (this version, v4)]

Title: Measuring the (Un)Faithfulness of Concept-Based Explanations
Authors: Shubham Kumar, Narendra Ahuja

Abstract: Deep vision models perform input-output computations that are hard to interpret. Concept-based explanation methods (CBEMs) increase interpretability by re-expressing parts of the model with human-understandable semantic units, or concepts. Checking if the derived explanations are faithful -- that is, they represent the model's internal computation -- requires a surrogate that combines concepts to compute the output. Simplifications made for interpretability inevitably reduce faithfulness, resulting in a tradeoff between the two. State-of-the-art unsupervised CBEMs (U-CBEMs) are seemingly more interpretable, while also being more faithful to the model. However, we observe that the reported improvement in faithfulness artificially results from either (1) using overly complex surrogates, which introduces an unmeasured cost to the explanation's interpretability, or (2) relying on deletion-based approaches that, as we demonstrate, do not properly measure faithfulness. We propose Surrogate Faithfulness (SURF), which (1) replaces prior complex surrogates with a simple, linear sur...

Originally published on March 31, 2026. Curated by AI News.
