[2504.10833] Measuring the (Un)Faithfulness of Concept-Based Explanations
Computer Science > Machine Learning

arXiv:2504.10833 (cs)
[Submitted on 15 Apr 2025 (v1), last revised 27 Mar 2026 (this version, v4)]

Title: Measuring the (Un)Faithfulness of Concept-Based Explanations
Authors: Shubham Kumar, Narendra Ahuja

Abstract: Deep vision models perform input-output computations that are hard to interpret. Concept-based explanation methods (CBEMs) increase interpretability by re-expressing parts of the model in terms of human-understandable semantic units, or concepts. Checking whether the derived explanations are faithful -- that is, whether they represent the model's internal computation -- requires a surrogate that combines the concepts to compute the output. Simplifications made for interpretability inevitably reduce faithfulness, resulting in a tradeoff between the two. State-of-the-art unsupervised CBEMs (U-CBEMs) appear to be more interpretable while also being more faithful to the model. However, we observe that the reported improvements in faithfulness arise artificially from either (1) using overly complex surrogates, which introduces an unmeasured cost to the explanation's interpretability, or (2) relying on deletion-based approaches that, as we demonstrate, do not properly measure faithfulness. We propose Surrogate Faithfulness (SURF), which (1) replaces prior complex surrogates with a simple, linear surrogate ...
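Since the abstract is truncated, the following is only a minimal sketch of the general idea it describes -- a simple linear surrogate from concepts to outputs, scored by how well it reproduces the model's predictions -- and not the paper's SURF method. All function names, the toy data, and the argmax-agreement metric are illustrative assumptions.

```python
# Hypothetical sketch: fit a least-squares linear map from concept activations
# to the explained model's logits, then score "faithfulness" as the fraction of
# inputs where the surrogate's predicted class matches the model's. This is an
# assumed formulation for illustration, not the paper's SURF implementation.
import numpy as np

def fit_linear_surrogate(concepts: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Solve min_W ||[C, 1] W - Y||^2 for a linear map with bias.

    concepts: (n_samples, n_concepts) concept activations per input.
    logits:   (n_samples, n_classes) outputs of the model being explained.
    """
    design = np.hstack([concepts, np.ones((concepts.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, logits, rcond=None)
    return weights

def surrogate_agreement(concepts: np.ndarray, logits: np.ndarray,
                        weights: np.ndarray) -> float:
    """Fraction of inputs where surrogate argmax matches the model argmax."""
    design = np.hstack([concepts, np.ones((concepts.shape[0], 1))])
    surrogate_logits = design @ weights
    return float(np.mean(surrogate_logits.argmax(1) == logits.argmax(1)))

# Toy usage with random data standing in for real concept activations.
rng = np.random.default_rng(0)
C = rng.normal(size=(500, 20))                                        # 20 concepts
Y = C @ rng.normal(size=(20, 10)) + 0.1 * rng.normal(size=(500, 10))  # 10 classes
W = fit_linear_surrogate(C, Y)
print(f"argmax agreement: {surrogate_agreement(C, Y, W):.3f}")
```

Under this reading, a more complex surrogate (e.g., a nonlinear model in place of the linear map) would mechanically raise the agreement score while making the explanation itself harder to interpret, which is the unmeasured cost the abstract points to.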