[2603.21276] Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity
Computer Science > Machine Learning

arXiv:2603.21276 (cs) [Submitted on 22 Mar 2026]

Title: Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity

Authors: Zihan Fang, Qianru Wang, Haonan An, Zheng Lin, Yiqin Deng, Xianhao Chen, Yuguang Fang

Abstract: Large language models (LLMs) increasingly adopt Mixture-of-Experts (MoE) architectures to scale model capacity while reducing computation. Fine-tuning these MoE-based LLMs often requires access to distributed, privacy-sensitive data, making centralized fine-tuning impractical. Federated learning (FL) therefore provides a paradigm for collaboratively fine-tuning MoE-based LLMs, enabling each client to integrate diverse knowledge without compromising data privacy. However, integrating MoE-based LLM fine-tuning into FL encounters two critical aggregation challenges arising from inherent data heterogeneity across clients: (i) divergent local data distributions drive clients to develop distinct gating preferences for localized expert selection, so that direct parameter aggregation produces a "one-size-fits-none" global gating network, and (ii) same-indexed experts develop disparate semantic roles across clients, leading to expert semantic blurring and the degradation of expert specialization. To address th...
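The first challenge, a "one-size-fits-none" global gating network, can be illustrated with a minimal sketch. This is not the authors' method; the gating logits, expert count, and FedAvg-style averaging below are hypothetical assumptions used only to show how averaging two sharply divergent client gates yields a near-uniform global gate that matches neither client:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """Convert gating logits into a probability distribution over experts."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-client gating logits over 4 experts, shaped by each
# client's local data distribution (illustrative values, not from the paper).
client_a_logits = np.array([4.0, 0.0, 0.0, 0.0])  # strongly prefers expert 0
client_b_logits = np.array([0.0, 0.0, 0.0, 4.0])  # strongly prefers expert 3

# Direct FedAvg-style parameter aggregation of the two gating networks.
global_logits = (client_a_logits + client_b_logits) / 2

print("client A gate:", softmax(client_a_logits).round(3))  # sharply peaked on expert 0
print("client B gate:", softmax(client_b_logits).round(3))  # sharply peaked on expert 3
print("global gate:  ", softmax(global_logits).round(3))    # flattened: no clear expert preference
```

Each local gate routes almost all probability mass to its preferred expert, while the averaged gate spreads mass across experts, so neither client's localized expert selection is preserved.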