[2603.24652] Demystifying When Pruning Works via Representation Hierarchies
arXiv:2603.24652 (cs) [Submitted on 25 Mar 2026]
Computer Science > Computation and Language

Title: Demystifying When Pruning Works via Representation Hierarchies
Authors: Shwai He, Guoheng Sun, Haichao Zhang, Yun Fu, Ang Li

Abstract: Network pruning, which removes less important parameters or architectures, is often expected to improve efficiency while preserving performance. However, this expectation does not consistently hold across language tasks: pruned models can perform well on non-generative tasks but frequently fail in generative settings. To understand this discrepancy, we analyze network pruning from a representation-hierarchy perspective, decomposing the internal computation of language models into three sequential spaces: embedding (hidden representations), logit (pre-softmax outputs), and probability (post-softmax distributions). We find that representations in the embedding and logit spaces are largely robust to pruning-induced perturbations. However, the nonlinear transformation from logits to probabilities amplifies these deviations, which accumulate across time steps and lead to substantial degradation during generation. In contrast, the stability of the categorical-token probability subspace, together with the robustness of the embedding space, supports the effectiveness of pruning for non-generative tasks...
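The abstract's central mechanism, that a small deviation in logit space can be amplified by the softmax nonlinearity, can be illustrated with a minimal numerical sketch. The logit values and the perturbation below are hypothetical and are not taken from the paper; they only demonstrate the qualitative effect: a modest relative change in logits produces a much larger relative change in probabilities and can even flip the top token, which in autoregressive decoding alters the next input and compounds over time steps.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical clean logits and a small pruning-style perturbation.
logits = np.array([5.0, 4.8, 1.0, 0.5])
perturbed = logits + np.array([-0.3, 0.3, 0.0, 0.0])

p = softmax(logits)     # clean next-token distribution
q = softmax(perturbed)  # distribution after the logit deviation

# Relative deviation in logit space vs. probability space.
logit_rel = np.abs(perturbed - logits).max() / np.abs(logits).max()
prob_rel = np.abs(q - p).max() / p.max()

print(f"logit deviation:       {logit_rel:.2f}")   # ~0.06
print(f"probability deviation: {prob_rel:.2f}")    # noticeably larger
print("argmax flipped:", p.argmax() != q.argmax())
```

In this toy setting the top-ranked token changes even though the logit perturbation is only a few percent of the logit magnitude; during generation such a flip feeds a different token back into the model, which is one way per-step deviations can accumulate.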