[2603.25035] Mechanistically Interpreting Compression in Vision-Language Models
Computer Science > Artificial Intelligence

arXiv:2603.25035 (cs)

[Submitted on 26 Mar 2026]

Title: Mechanistically Interpreting Compression in Vision-Language Models

Authors: Veeraraju Elluru, Arth Singh, Roberto Aguero, Ajay Agarwal, Debojyoti Das, Hreetam Paul

Abstract: Compressed vision-language models (VLMs) are widely used to reduce memory and compute costs, making them practical for real-world deployment. However, compressing these models raises the question of whether their internal computations and safety behaviors are preserved. In this work, we use causal circuit analysis and crosscoder-based feature comparisons to examine how pruning and quantization change the internal computations of representative VLMs. We observe that pruning generally keeps circuit structure intact but rotates and attenuates internal features, whereas quantization alters circuits more substantially yet leaves the surviving features better aligned. Leveraging this insight, we also introduce VLMSafe-420, a benchmark that pairs harmful inputs with matched benign counterfactuals across a range of safety categories. Our findings show that pruning causes a sharp drop in genuine refusal behavior, suggesting that the choice of compression method has safety implications.

Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.25035 [cs.AI]
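To make the abstract's central contrast concrete, below is a minimal, self-contained sketch of the kind of feature-alignment comparison it describes: measuring how much a layer's features rotate (mean cosine similarity) and attenuate (mean norm ratio) after pruning versus quantization. The toy MLP, the magnitude-pruning and int8 fake-quantization routines, and the metric names are all illustrative assumptions; the paper's actual crosscoder-based pipeline on real VLMs is not reproduced here.

```python
# Illustrative sketch only: compares base vs. compressed activations with
# cosine similarity (a rotation proxy) and norm ratio (an attenuation proxy).
# The model, compression routines, and data are toy assumptions, not the
# paper's method.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

def magnitude_prune_(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the smallest-magnitude weights of every Linear layer, in place."""
    for m in model.modules():
        if isinstance(m, nn.Linear):
            w = m.weight.data
            k = int(w.numel() * sparsity)
            if k == 0:
                continue
            thresh = w.abs().flatten().kthvalue(k).values
            w[w.abs() <= thresh] = 0.0

def int8_quantize_(model: nn.Module) -> None:
    """Fake-quantize Linear weights to symmetric int8 (quantize then dequantize)."""
    for m in model.modules():
        if isinstance(m, nn.Linear):
            w = m.weight.data
            scale = w.abs().max() / 127.0
            m.weight.data = torch.round(w / scale).clamp(-127, 127) * scale

def feature_alignment(base: nn.Module, compressed: nn.Module, x: torch.Tensor):
    """Mean cosine similarity and norm ratio between base and compressed features."""
    with torch.no_grad():
        h_base = base(x)
        h_comp = compressed(x)
    cos = nn.functional.cosine_similarity(h_base, h_comp, dim=-1).mean()
    norm_ratio = (h_comp.norm(dim=-1) / h_base.norm(dim=-1)).mean()
    return cos.item(), norm_ratio.item()

base = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
x = torch.randn(512, 64)

pruned = copy.deepcopy(base)
magnitude_prune_(pruned, sparsity=0.5)

quantized = copy.deepcopy(base)
int8_quantize_(quantized)

for name, model in [("pruned", pruned), ("int8", quantized)]:
    cos, ratio = feature_alignment(base, model, x)
    print(f"{name}: mean cosine = {cos:.3f}, mean norm ratio = {ratio:.3f}")
```

Under the abstract's finding, one would expect the pruned model to show lower cosine similarity and a norm ratio below 1 (rotated, attenuated features), while the quantized model keeps features better aligned despite larger circuit-level changes; this toy setup only illustrates the metrics, not that result.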