[2409.17517] Dataset Distillation-based Hybrid Federated Learning on Non-IID Data
Computer Science > Machine Learning
arXiv:2409.17517 (cs)
[Submitted on 26 Sep 2024 (v1), last revised 24 Mar 2026 (this version, v3)]

Title: Dataset Distillation-based Hybrid Federated Learning on Non-IID Data
Authors: Xiufang Shi, Wei Zhang, Yuheng Li, Mincheng Wu, Zhenyu Wen, Shibo He, Tejal Shah, Rajiv Ranjan

Abstract: In federated learning, the heterogeneity of client data has a significant impact on model training performance, and many of these heterogeneity issues arise from non-independent and identically distributed (non-IID) data. To address label distribution skew, we propose a hybrid federated learning framework called HFLDD, which integrates dataset distillation to generate approximately independent and identically distributed (IID) data, thereby improving model training performance. In particular, we partition the clients into heterogeneous clusters, where the data labels among different clients within a cluster are unbalanced while the data labels among different clusters are balanced. The cluster heads collect distilled data from their corresponding cluster members and conduct model training in collaboration with the server. This training process resembles traditional federated learning on IID data and hence effectively alleviates the impact of non-IID data on model training. ...
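To make the clustering idea concrete, below is a minimal sketch of how clients with skewed labels might be grouped into heterogeneous clusters whose aggregate label distributions are approximately balanced, so that each cluster head ends up holding near-IID (distilled) data. The function name `partition_into_clusters` and the greedy assignment heuristic are illustrative assumptions, not the paper's actual clustering rule; it only assumes each client can report a label-count histogram.

```python
import numpy as np

def partition_into_clusters(client_label_hists, num_clusters):
    """Greedily assign clients to clusters so that each cluster's
    aggregate label distribution is as close to uniform as possible.

    client_label_hists: list of 1-D arrays, one label-count histogram
    per client. Returns a list of client-index lists, one per cluster.

    NOTE: this greedy heuristic is an illustrative assumption, not the
    clustering method from the HFLDD paper.
    """
    num_labels = len(client_label_hists[0])
    clusters = [[] for _ in range(num_clusters)]
    totals = [np.zeros(num_labels) for _ in range(num_clusters)]

    # Place the most label-skewed clients first, so later assignments
    # can balance them out within each cluster.
    order = sorted(range(len(client_label_hists)),
                   key=lambda i: -np.std(client_label_hists[i]))
    for i in order:
        # Pick the cluster whose aggregate label distribution becomes
        # most uniform after absorbing this client's labels.
        def imbalance(k):
            merged = totals[k] + client_label_hists[i]
            return np.std(merged / merged.sum())
        best = min(range(num_clusters), key=imbalance)
        clusters[best].append(i)
        totals[best] += client_label_hists[i]
    return clusters

# Example: 8 clients with label distribution skew, each holding mostly
# one of 4 labels, partitioned into 2 clusters.
rng = np.random.default_rng(0)
hists = [np.eye(4)[i % 4] * 90 + rng.integers(1, 10, size=4)
         for i in range(8)]
for k, members in enumerate(partition_into_clusters(hists, 2)):
    print(f"cluster {k}: clients {members}")
```

Under this sketch, each cluster contains clients whose individual label distributions are unbalanced, but the union of labels within a cluster is roughly uniform, matching the property the abstract describes before the cluster heads collect distilled data and train with the server.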