[2602.18116] Cut Less, Fold More: Model Compression through the Lens of Projection Geometry

arXiv - Machine Learning · 3 min read

Summary

This paper analyzes calibration-free model compression through the lens of projection geometry, comparing structured pruning with model folding and showing that folding typically preserves more accuracy without any retraining.

Why It Matters

As neural networks grow in size and complexity, efficient deployment becomes crucial. This research compares compression methods that require neither retraining nor calibration data, showing which ones preserve accuracy while cutting compute and memory requirements, making it relevant for developers and researchers deploying models at scale.

Key Takeaways

  • Model folding typically outperforms structured pruning in post-compression accuracy.
  • The study formalizes both pruning and folding as orthogonal projection operators (see the sketch after this list).
  • Folding's advantage holds across diverse training setups, with the largest gains at moderate-to-high compression.
  • Calibration-free methods are essential for scalable neural network deployment.
  • The research evaluates over 1000 checkpoints, providing robust empirical evidence.
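
To make the geometric contrast concrete, here is a minimal NumPy sketch (illustrative only, not the authors' code): structured pruning keeps a subset of a weight matrix's rows and zeroes the rest (an axis-aligned projection), while folding clusters the rows and replaces each with its cluster centroid (a low-rank projection). The norm-based keep criterion and the plain k-means loop are assumptions made for this demo.

    import numpy as np

    def structured_prune(W, keep):
        """Axis-aligned projection: keep the `keep` rows of W with the
        largest L2 norm and zero out the rest (illustrative criterion)."""
        out = np.zeros_like(W)
        idx = np.argsort(np.linalg.norm(W, axis=1))[-keep:]
        out[idx] = W[idx]
        return out

    def fold(W, k, iters=25, seed=0):
        """Low-rank projection via weight clustering: run k-means on the
        rows of W, then replace every row with its cluster centroid."""
        rng = np.random.default_rng(seed)
        centroids = W[rng.choice(len(W), size=k, replace=False)].copy()
        for _ in range(iters):
            # assign each row to its nearest centroid
            dists = np.linalg.norm(W[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # recompute centroids; keep the old one if a cluster empties
            for j in range(k):
                if np.any(labels == j):
                    centroids[j] = W[labels == j].mean(axis=0)
        return centroids[labels]  # rank <= k reconstruction of W

    W = np.random.default_rng(1).normal(size=(64, 32))
    for name, W_hat in [("prune", structured_prune(W, keep=16)),
                        ("fold ", fold(W, k=16))]:
        print(name, np.linalg.norm(W - W_hat))  # parameter reconstruction error

At a matched budget (16 surviving rows vs. 16 centroids), the folded reconstruction usually lands closer to W in Frobenius norm, which matches the pattern the paper proves within a rank distance of one.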

Computer Science > Machine Learning
arXiv:2602.18116 (cs) [Submitted on 20 Feb 2026]
Title: Cut Less, Fold More: Model Compression through the Lens of Projection Geometry
Authors: Olga Saukh, Dong Wang, Haris Šikić, Yun Cheng, Lothar Thiele

Abstract: Compressing neural networks without retraining is vital for deployment at scale. We study calibration-free compression through the lens of projection geometry: structured pruning is an axis-aligned projection, whereas model folding performs a low-rank projection via weight clustering. We formalize both as orthogonal operators and show that, within a rank distance of one, folding provably yields smaller parameter reconstruction error and, under mild smoothness assumptions, smaller functional perturbations than pruning. At scale, we evaluate >1000 checkpoints spanning ResNet18, PreActResNet18, ViT-B/32, and CLIP ViT-B/32 on CIFAR-10 and ImageNet-1K, covering diverse training hyperparameters (optimizers, learning rates, augmentations, regularization, sharpness-aware training), as well as multiple LLaMA-family 60M and 130M parameter models trained on C4. We show that folding typically achieves higher post-compression accuracy, with the largest gains at moderate-high compression. The gap narrows and occasionally reverses at specific training setups. Our results po...
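
The abstract's framing of both compressors as orthogonal projection operators can be written out explicitly; the notation below is a plausible reconstruction, not the paper's own.

    % Both compressors as projections of a weight matrix W in R^{n x d}
    % (notation assumed for illustration, not taken from the paper).
    \[
      P_{\mathrm{prune}} = S S^{\top}, \qquad
      P_{\mathrm{fold}}  = C \left( C^{\top} C \right)^{-1} C^{\top}, \qquad
      \widehat{W} = P \, W .
    \]
    % S in {0,1}^{n x k} selects k rows, so P_prune W keeps those rows and
    % zeroes the rest (an axis-aligned subspace). C in {0,1}^{n x k} is a
    % cluster-membership matrix, so P_fold W replaces each row of W with its
    % cluster mean. Both satisfy P^2 = P = P^T, and the quantity compared is
    % the reconstruction error || W - P W ||_F.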

Related Articles

LLMs

I can't help rooting for tiny open source AI model maker Arcee | TechCrunch

Arcee is a tiny 26-person U.S. startup that built a high-performing, massive, open source LLM. And it's gaining popularity with OpenClaw ...

TechCrunch - AI · 4 min
Machine Learning

We have an AI agent fragmentation problem

Every AI agent works fine on its own — but the moment you try to use more than one, everything falls apart. Different runtimes. Different...

Reddit - Artificial Intelligence · 1 min
Machine Learning

Using AI properly

AI is a tool. Period. I spent decades asking forums for help in writing HTML code for my website. I wanted my posts to self-scroll to a p...

Reddit - Artificial Intelligence · 1 min
LLMs

Anthropic Teams Up With Its Rivals to Keep AI From Hacking Everything | WIRED

The AI lab's Project Glasswing will bring together Apple, Google, and more than 45 other organizations. They'll use the new Claude Mythos...

Wired - AI · 7 min