[2603.02217] Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression
Computer Science > Machine Learning

arXiv:2603.02217 (cs)
[Submitted on 10 Feb 2026]

Title: Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression
Authors: Sieun Hyeon, Jaeyoung Do

Abstract: Mixture-of-Experts (MoE) models scale capacity efficiently, but their massive parameter footprint creates a deployment-time memory bottleneck. We organize retraining-free MoE compression into three paradigms (Expert Pruning, Expert Editing, and Expert Merging) and show that persistent post-compression degradation largely stems from a neglected factor: router-expert mismatch when experts are changed but the router is left untouched. We argue that effective retraining-free compression should avoid updating expert parameters while allowing lightweight router calibration. To this end, we propose Router Knowledge Distillation (Router KD), which updates only a tiny fraction of parameters (the router) by distilling the original model's next-token distribution on unlabeled calibration data. Experiments across representative methods in all three paradigms demonstrate consistent performance recovery, with substantially larger gains in fine-grained MoEs (many small experts) than in coarse-grained MoEs, owing to their more complex routing decision boundaries.

Subjects: Machine Learning
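The core idea of Router KD, as described in the abstract, is to freeze the (compressed) experts and tune only the router's parameters so that the compressed model's next-token distribution matches the original model's, via a KL distillation loss on unlabeled calibration data. The following is a minimal toy sketch of that idea, not the paper's implementation: two frozen "experts" each emit fixed vocabulary logits, the teacher distribution stands in for the original model's output, and finite-difference gradient descent updates only the router logits. All shapes, values, and helper names are illustrative assumptions.

```python
# Toy sketch of router-only calibration via KL distillation (Router KD idea).
# Experts are frozen; only the router's mixing logits are trained to match
# a teacher next-token distribution. Numbers here are illustrative only.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def kl(p, q):
    # KL(p || q): the distillation loss between teacher p and student q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Frozen expert outputs: vocab logits from two experts (never updated).
EXPERTS = [[2.0, 0.5, -1.0], [-0.5, 1.5, 0.5]]

# Teacher next-token distribution on one unlabeled calibration token.
TEACHER = softmax([1.0, 1.2, -0.8])

def student_dist(router_logits):
    # Gate the frozen experts with the router, then softmax over the vocab.
    gates = softmax(router_logits)
    mixed = [sum(g * e[v] for g, e in zip(gates, EXPERTS))
             for v in range(len(EXPERTS[0]))]
    return softmax(mixed)

def calibrate(router_logits, steps=200, lr=0.5, eps=1e-4):
    # Gradient descent on the router alone; gradients by finite differences,
    # so no autograd framework is needed for this toy example.
    r = list(router_logits)
    for _ in range(steps):
        grads = []
        for i in range(len(r)):
            r_hi = list(r); r_hi[i] += eps
            r_lo = list(r); r_lo[i] -= eps
            g = (kl(TEACHER, student_dist(r_hi)) -
                 kl(TEACHER, student_dist(r_lo))) / (2 * eps)
            grads.append(g)
        r = [ri - lr * gi for ri, gi in zip(r, grads)]
    return r

before = kl(TEACHER, student_dist([0.0, 0.0]))
r_new = calibrate([0.0, 0.0])
after = kl(TEACHER, student_dist(r_new))
print(f"KL before calibration: {before:.4f}, after: {after:.4f}")
```

The structure mirrors the paper's constraint: the expert parameters (`EXPERTS`) never change, and only the router logits move, which is why the update touches such a tiny fraction of parameters.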