[2504.06768] FedMerge: Federated Personalization via Model Merging
Summary
The paper introduces FedMerge, a novel approach in federated learning that enables personalized model creation for clients by merging multiple global models with optimized weights, improving performance on non-IID tasks without local finetuning.
Why It Matters
FedMerge addresses a critical challenge in federated learning—serving diverse clients with non-IID data distributions. By allowing personalized model merging, it enhances model performance and reduces the need for extensive local adjustments, making federated learning more efficient and applicable across various domains.
Key Takeaways
- FedMerge allows for personalized model creation by merging multiple global models.
- The approach eliminates the need for local finetuning, streamlining the process.
- FedMerge outperforms existing federated learning methods in non-IID settings.
- The method optimizes merging weights for each client, enhancing model alignment.
- It addresses client drift effectively, improving consistency in model training.
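The core mechanism described above, building a per-client model as a weighted combination of a few shared global models, can be sketched in a few lines. This is an illustrative toy example, not the paper's implementation: the `merge` helper, parameter names, and scalar parameters are assumptions; in FedMerge the merging weights themselves are optimized jointly with the global models.

```python
# Hedged sketch of per-client model merging: each client receives a single
# merged model built from K shared global models and client-specific weights.
# Parameter names and the merge() helper are illustrative, not from the paper.

def merge(global_models, weights):
    """Combine K global models (dicts of parameters) with per-client weights."""
    merged = {}
    for name in global_models[0]:
        merged[name] = sum(w * m[name] for w, m in zip(weights, global_models))
    return merged

# Toy example: K = 2 global models, one scalar parameter each.
g1 = {"layer.w": 1.0}
g2 = {"layer.w": 3.0}
client_weights = [0.25, 0.75]  # in FedMerge these are optimized per client
personalized = merge([g1, g2], client_weights)
# personalized["layer.w"] is 0.25 * 1.0 + 0.75 * 3.0 = 2.5
```

Note that under this scheme the server sends each client only its merged model, not all K global models, which matches the communication pattern the abstract describes.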
Paper Details
Computer Science > Machine Learning, arXiv:2504.06768 (cs)
[Submitted on 9 Apr 2025 (v1), last revised 18 Feb 2026 (this version, v3)]
Title: FedMerge: Federated Personalization via Model Merging
Authors: Shutong Chen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
Abstract: One global model in federated learning (FL) might not be sufficient to serve many clients with non-IID tasks and distributions. While there have been advances in FL toward training multiple global models for better personalization, they only provide limited choices to clients, so local finetuning is still indispensable. In this paper, we propose a novel "FedMerge" approach that can create a personalized model per client by simply merging multiple global models with automatically optimized and customized weights. In FedMerge, a few global models can serve many non-IID clients, even without further local finetuning. We formulate this problem as a joint optimization of global models and the merging weights for each client. Unlike existing FL approaches where the server broadcasts one or multiple global models to all clients, the server only needs to send a customized, merged model to each client. Moreover, instead of periodically interrupting the local training and re-initializing it to a global model, the merged model aligns better with each client's task and d...