[2602.18116] Cut Less, Fold More: Model Compression through the Lens of Projection Geometry
Summary
This paper studies calibration-free model compression for neural networks through the lens of projection geometry, comparing structured pruning and model folding without any retraining.
Why It Matters
As neural networks grow in complexity, efficient deployment becomes crucial. This research presents a novel approach to model compression that can enhance performance while reducing resource requirements, making it relevant for developers and researchers in machine learning and AI.
Key Takeaways
- Model folding typically outperforms structured pruning in post-compression accuracy, with the largest gains at moderate-to-high compression.
- The study formalizes both compression techniques as orthogonal projection operators: pruning is an axis-aligned projection, folding a low-rank projection via weight clustering.
- Folding's advantage holds across diverse training conditions, though the gap narrows and occasionally reverses for specific training setups.
- Calibration-free methods are essential for deploying compressed neural networks at scale.
- The evaluation spans over 1000 checkpoints across vision and language models, providing robust empirical evidence.
Computer Science > Machine Learning — arXiv:2602.18116 (cs)
Submitted on 20 Feb 2026
Authors: Olga Saukh, Dong Wang, Haris Šikić, Yun Cheng, Lothar Thiele
Abstract: Compressing neural networks without retraining is vital for deployment at scale. We study calibration-free compression through the lens of projection geometry: structured pruning is an axis-aligned projection, whereas model folding performs a low-rank projection via weight clustering. We formalize both as orthogonal operators and show that, within a rank distance of one, folding provably yields smaller parameter reconstruction error, and under mild smoothness assumptions, smaller functional perturbations than pruning. At scale, we evaluate >1000 checkpoints spanning ResNet18, PreActResNet18, ViT-B/32, and CLIP ViT-B/32 on CIFAR-10 and ImageNet-1K, covering diverse training hyperparameters (optimizers, learning rates, augmentations, regularization, sharpness-aware training), as well as multiple LLaMA-family 60M and 130M parameter models trained on C4. We show that folding typically achieves higher post-compression accuracy, with the largest gains at moderate-high compression. The gap narrows and occasionally reverses at specific training setups. Our results po...
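To make the geometric framing concrete, here is a toy NumPy sketch (illustrative only, not the paper's algorithm or proof): structured pruning zeroes out whole rows of a weight matrix, an axis-aligned projection, while folding replaces each row by its cluster centroid, a low-rank projection via weight clustering. The matrix shape, cluster count, and the simple Lloyd-iteration clustering are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))  # toy weight matrix (rows = output neurons)
k = 48                          # rows/clusters to keep in both schemes

# Structured pruning: axis-aligned projection.
# Keep the k rows with the largest L2 norm; zero the rest.
norms = np.linalg.norm(W, axis=1)
keep = np.argsort(norms)[-k:]
W_pruned = np.zeros_like(W)
W_pruned[keep] = W[keep]

# Model folding: low-rank projection via weight clustering.
# Cluster the rows into k groups (plain Lloyd iterations) and
# replace each row with its centroid, so W_folded has at most
# k distinct rows, hence rank at most k.
centroids = W[rng.choice(len(W), size=k, replace=False)]
for _ in range(20):
    dists = np.linalg.norm(W[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    for c in range(k):
        members = assign == c
        if members.any():
            centroids[c] = W[members].mean(axis=0)
W_folded = centroids[assign]

# Parameter reconstruction error (Frobenius norm) for each projection.
err_prune = np.linalg.norm(W - W_pruned)
err_fold = np.linalg.norm(W - W_folded)
print(f"pruning error: {err_prune:.2f}, folding error: {err_fold:.2f}")
```

On this random toy matrix the two errors are merely comparable; the paper's provable gap concerns operators within a rank distance of one, which this sketch does not reproduce.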