[2602.21078] ProxyFL: A Proxy-Guided Framework for Federated Semi-Supervised Learning
Summary
The article presents ProxyFL, a novel framework for Federated Semi-Supervised Learning (FSSL) that addresses data heterogeneity issues by using a proxy-guided approach to optimize model training across clients with partially labeled data.
Why It Matters
As federated learning becomes increasingly important in privacy-sensitive applications, understanding how to effectively manage data heterogeneity is crucial. ProxyFL offers a new method to enhance model performance and convergence in FSSL, which could lead to more robust applications in various fields, including healthcare and finance.
Key Takeaways
- ProxyFL mitigates both external and internal data heterogeneity in FSSL.
- The framework treats the classifier's learnable weights as a unified proxy and uses it to optimize global model performance against outliers.
- It re-includes unlabeled samples that would otherwise be discarded, so more data takes part in training.
- The authors report experiments showing significant performance improvements.
- The approach could enhance privacy-preserving collaborative learning.
Computer Science > Machine Learning
arXiv:2602.21078 (cs)
[Submitted on 24 Feb 2026]
Title: ProxyFL: A Proxy-Guided Framework for Federated Semi-Supervised Learning
Authors: Duowen Chen, Yan Wang
Abstract: Federated Semi-Supervised Learning (FSSL) aims to collaboratively train a global model across clients by leveraging partially annotated local data in a privacy-preserving manner. In FSSL, data heterogeneity is a challenging issue that exists both across clients and within clients. External heterogeneity refers to the data distribution discrepancy across different clients, while internal heterogeneity represents the mismatch between labeled and unlabeled data within clients. Most FSSL methods design fixed or dynamic parameter aggregation strategies to collect client knowledge on the server (external) and/or filter out low-confidence unlabeled samples to reduce mistakes on local clients (internal). However, the former struggles to fit the ideal global distribution precisely via direct weights, and the latter leaves less data participating in FL training. To this end, we propose a proxy-guided framework called ProxyFL that focuses on simultaneously mitigating external and internal heterogeneity via a unified proxy. That is, we consider the learnable weights of the classifier as a proxy to simulate the cate...
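The two conventional strategies the abstract contrasts ProxyFL with can be illustrated with a small sketch. This is not ProxyFL itself, only the baseline practice the abstract describes: a fixed, size-weighted parameter average on the server (in the style of FedAvg) and a confidence filter on unlabeled samples at the client. The function names, the toy values, and the 0.95 threshold are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def aggregate_weights(client_params: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Fixed aggregation: average client parameter vectors, weighted by data size."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def filter_by_confidence(probs: Sequence[Sequence[float]],
                         threshold: float = 0.95  # assumed value, not from the paper
                         ) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Split unlabeled samples into pseudo-labeled (index, class) pairs and discarded indices."""
    kept: List[Tuple[int, int]] = []
    discarded: List[int] = []
    for idx, p in enumerate(probs):
        confidence = max(p)
        if confidence >= threshold:
            # Confident prediction: use the arg-max class as the pseudo-label.
            kept.append((idx, list(p).index(confidence)))
        else:
            # Low-confidence prediction: the sample is dropped from local training.
            discarded.append(idx)
    return kept, discarded


# Toy example: two clients with 1 and 3 samples, and three unlabeled predictions.
global_params = aggregate_weights([[1.0, 2.0], [3.0, 4.0]], [1, 3])
kept, discarded = filter_by_confidence(
    [[0.97, 0.02, 0.01], [0.5, 0.3, 0.2], [0.1, 0.85, 0.05]]
)
print(global_params)  # [2.5, 3.5]
print(kept)           # [(0, 0)]
print(discarded)      # [1, 2]
```

In this toy run two of the three unlabeled samples are discarded, which is the reduced data participation the abstract identifies as a drawback of confidence filtering.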