[2602.22470] Beyond performance-wise Contribution Evaluation in Federated Learning
Summary
This paper explores the limitations of current evaluation methods in federated learning, emphasizing the need for a multidimensional approach to assess client contributions beyond mere performance metrics.
Why It Matters
As federated learning gains traction, understanding the diverse contributions of participants is crucial for improving model trustworthiness. This study highlights the inadequacies of existing evaluation methods and advocates a more holistic approach, one that also accounts for reliability, resilience, and fairness, since these dimensions are essential for equitable reward distribution in collaborative learning environments.
Key Takeaways
- Current evaluation methods in federated learning focus mainly on performance metrics like accuracy.
- Client contributions should also be evaluated based on reliability, resilience, and fairness.
- No single metric can comprehensively evaluate client contributions, indicating a need for multidimensional assessment.
- The study employs a state-of-the-art approximation of the Shapley value to quantify contributions along each dimension.
- Findings suggest that clients may excel in different dimensions, necessitating a reevaluation of reward allocation strategies.
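The Shapley value mentioned above attributes to each client its average marginal contribution over all possible coalitions of the other clients. As a minimal sketch, here is the exact (exponential-time) computation for a toy federation; the `utility` function is a hypothetical stand-in for training a model on a coalition's data and scoring it on whichever dimension is being evaluated (accuracy, reliability, resilience, or fairness). The paper itself uses an approximation, since the exact form is infeasible beyond a handful of clients.

```python
from itertools import combinations
from math import factorial

def shapley_values(clients, utility):
    """Exact Shapley value per client.

    `utility` maps a frozenset of clients to a real-valued score for a
    model trained on that coalition's data (hypothetical placeholder).
    Runs in O(2^n) coalition evaluations, so only viable for small n.
    """
    n = len(clients)
    values = {}
    for i in clients:
        others = [c for c in clients if c != i]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                S = frozenset(subset)
                # Weight of this coalition size in the Shapley average.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal contribution of client i to coalition S.
                total += weight * (utility(S | {i}) - utility(S))
        values[i] = total
    return values
```

For an additive utility (each client's data independently adds a fixed amount of score), the Shapley value recovers exactly each client's individual amount, which is a standard sanity check for any approximation scheme.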
Computer Science > Machine Learning
arXiv:2602.22470 (cs) [Submitted on 25 Feb 2026]
Title: Beyond performance-wise Contribution Evaluation in Federated Learning
Authors: Balazs Pejo
Abstract: Federated learning offers a privacy-friendly collaborative learning framework, yet its success, like any joint venture, hinges on the contributions of its participants. Existing client evaluation methods predominantly focus on model performance, such as accuracy or loss, which represents only one dimension of a machine learning model's overall utility. In contrast, this work investigates the critical, yet overlooked, issue of client contributions towards a model's trustworthiness -- specifically, its reliability (tolerance to noisy data), resilience (resistance to adversarial examples), and fairness (measured via demographic parity). To quantify these multifaceted contributions, we employ a state-of-the-art approximation of the Shapley value, a principled method for value attribution. Our results reveal that no single client excels across all dimensions, which are largely independent of each other, highlighting a critical flaw in current evaluation schemes: no single metric is adequate for comprehensive evaluation and equitable reward allocation.
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as: arXiv:2602.22...
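The abstract measures fairness via demographic parity, i.e. whether the model's positive-prediction rate is the same across sensitive groups. A minimal sketch of that gap, assuming binary predictions and exactly two groups (the function name and group labels here are illustrative, not from the paper):

```python
def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between two groups.

    `preds` are binary model outputs (0/1); `groups` are the sensitive
    attribute labels, one per prediction. Assumes exactly two groups.
    """
    rate = {}
    for g in set(groups):
        group_preds = [p for p, grp in zip(preds, groups) if grp == g]
        rate[g] = sum(group_preds) / len(group_preds)
    a, b = rate.values()  # exactly-two-groups assumption
    return abs(a - b)
```

A gap of 0 means the model predicts positives at identical rates for both groups; in a coalition-based evaluation, a client's fairness contribution could then be scored by how much adding its data shrinks this gap.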