[2603.24044] MoE-Sieve: Routing-Guided LoRA for Efficient MoE Fine-Tuning

arXiv - Machine Learning

About this article

Computer Science > Machine Learning · arXiv:2603.24044 (cs) · Submitted on 25 Mar 2026

Title: MoE-Sieve: Routing-Guided LoRA for Efficient MoE Fine-Tuning
Authors: Andrea Manzoni

Abstract: Standard LoRA fine-tuning of Mixture-of-Experts (MoE) models applies adapters to every expert, yet our profiling shows that per-layer expert routing is highly skewed: a small subset of experts handles most tokens in each layer, while many others are rarely activated ("cold"). We propose MoE-Sieve, a simple routing-guided framework for LoRA fine-tuning, and pair it with a systematic profiling study of expert routing across architectures and tasks. The recipe is straightforward: profile routing counts on a small calibration set, select the top-k most-routed experts per layer, and apply LoRA only to those experts. Across two architecturally distinct MoE models and three diverse tasks, tuning only the top 25% of routed experts per layer remains competitive with full LoRA, with mean differences within +/-1 percentage point across all conditions. This reduces LoRA trainable parameters by 70-73%, adapter checkpoint size by 71-73%, and wall-clock training time by up to 50%. We also observe a non-monotonic relationship between expert count and seed-to-seed variance, consistent with the hypothesis that adapting cold experts can introduce gradient noise without improving accuracy.
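The expert-selection step the abstract describes can be sketched in a few lines. This is not the authors' code; it assumes per-layer routing counts have already been collected on a calibration set, and all names (`select_top_experts`, `routing_counts`, `keep_fraction`) are hypothetical illustrations of the top-25% rule:

```python
# Minimal sketch of routing-guided expert selection, as described in the
# abstract: keep only the most-routed fraction of experts per layer.
import numpy as np

def select_top_experts(routing_counts, keep_fraction=0.25):
    """For each layer, return the indices of the most-routed experts.

    routing_counts: dict mapping layer name -> per-expert token counts
                    measured on a small calibration set.
    keep_fraction:  fraction of experts per layer to adapt with LoRA.
    """
    selected = {}
    for layer, counts in routing_counts.items():
        counts = np.asarray(counts)
        k = max(1, int(round(keep_fraction * len(counts))))
        # Indices of the k experts with the highest routing counts.
        top = np.argsort(counts)[::-1][:k]
        selected[layer] = sorted(int(i) for i in top)
    return selected

# Toy example: 8 experts per layer with skewed routing, as the paper reports.
counts = {
    "layer0": [900, 40, 10, 850, 5, 30, 700, 20],
    "layer1": [15, 1200, 60, 10, 980, 25, 5, 40],
}
print(select_top_experts(counts))
# -> {'layer0': [0, 3], 'layer1': [1, 4]}  (top 25% = 2 of 8 experts)
```

In a real fine-tuning run, the returned indices would then be used to attach LoRA adapters only to the selected experts' weight matrices, leaving cold experts frozen.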

Originally published on March 26, 2026. Curated by AI News.
