[2602.18870] Federated Measurement of Demographic Disparities from Quantile Sketches
Summary
This paper presents a federated protocol for auditing demographic parity. Each silo shares only group counts and quantile sketches of its local score distributions, and the server uses them to estimate the global disparity without access to raw data.
Why It Matters
Fairness goals are usually defined at the population level, but the data needed to check them often sit in silos that cannot be pooled under privacy regulations. Measuring disparity in a federated way lets organizations in fields such as healthcare and finance audit score distributions jointly without exchanging raw records.
Key Takeaways
- Federated learning enables collaborative modeling without sharing raw data.
- The proposed method measures demographic disparity as a Wasserstein-Fréchet variance between the score distributions of the sensitive groups.
- A communication-efficient protocol allows sharing only group counts and quantile summaries.
- The approach provides finite-sample guarantees and minimizes discretization bias.
- Experiments show that a limited number of quantiles can effectively recover global disparities.
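The protocol described in the takeaways above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`midpoint_levels`, `local_sketch`, `global_disparity`), the choice of midpoint quantile levels, and the count-proportional group weights are all assumptions made for illustration. Each silo sends per-group counts and quantiles; the server treats each local sketch as equally weighted atoms, pools them by group, and computes the weighted variance of the group quantile functions around their weighted average.

```python
import numpy as np

def midpoint_levels(m):
    """Hypothetical choice of m shared quantile levels: (j + 0.5) / m."""
    return (np.arange(m) + 0.5) / m

def local_sketch(scores, groups, levels):
    """Silo side: per-group (count, quantiles at the shared levels)."""
    sketch = {}
    for g in np.unique(groups).tolist():
        s = scores[groups == g]
        sketch[g] = (len(s), np.quantile(s, levels))
    return sketch

def weighted_quantile(values, weights, levels):
    """Generalized inverse of the weighted empirical CDF of the atoms."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cdf = np.cumsum(w) / np.sum(w)
    idx = np.searchsorted(cdf, levels, side="left")
    return v[np.minimum(idx, len(v) - 1)]

def global_disparity(sketches, levels):
    """Server side: estimate the disparity from the silo sketches only."""
    m = len(levels)
    group_ids = sorted({g for sk in sketches for g in sk})
    counts, quantiles = [], []
    for g in group_ids:
        parts = [sk[g] for sk in sketches if g in sk]
        n = np.array([p[0] for p in parts], dtype=float)
        # Each silo contributes m atoms, each carrying weight n_k / m.
        atoms = np.concatenate([p[1] for p in parts])
        weights = np.repeat(n / m, m)
        counts.append(n.sum())
        quantiles.append(weighted_quantile(atoms, weights, levels))
    p = np.array(counts) / np.sum(counts)   # assumed group weights
    Q = np.vstack(quantiles)                # one row per group
    barycenter = p @ Q                      # weighted average quantile function
    return float(np.sum(p * np.mean((Q - barycenter) ** 2, axis=1)))
```

A server would call `global_disparity([sketch_1, sketch_2, ...], levels)` once every silo has reported, which matches the one-shot character of the protocol. Pooling atoms rather than averaging local quantiles matters because the global group distribution is a mixture across silos, and the quantile function of a mixture is not the average of the component quantile functions.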
Statistics > Machine Learning
arXiv:2602.18870 (stat)
[Submitted on 21 Feb 2026]
Title: Federated Measurement of Demographic Disparities from Quantile Sketches
Authors: Arthur Charpentier, Agathe Fernandes Machado, Olivier Côté, François Hu
Abstract: Many fairness goals are defined at a population level that misaligns with siloed data collection, which remains unsharable due to privacy regulations. Horizontal federated learning (FL) enables collaborative modeling across clients with aligned features without sharing raw data. We study federated auditing of demographic parity through score distributions, measuring disparity as a Wasserstein-Fréchet variance between sensitive-group score laws, and expressing the population metric in federated form that makes explicit how silo-specific selection drives local-global mismatch. For the squared Wasserstein distance, we prove an ANOVA-style decomposition that separates (i) selection-induced mixture effects from (ii) cross-silo heterogeneity, yielding tight bounds linking local and global metrics. We then propose a one-shot, communication-efficient protocol in which each silo shares only group counts and a quantile summary of its local score distributions, enabling the server to estimate global disparity and its d...
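For one-dimensional scores, the squared 2-Wasserstein distance has a closed form in terms of quantile functions, which helps explain why quantile summaries can carry the needed information. The block below is a sketch in notation introduced here rather than taken from the paper: $Q_a$ is the quantile function of the score law $\mu_a$ of group $a$, $p_a$ is an assumed group weight, and $u_j = (j - 1/2)/m$ are assumed midpoint levels.

```latex
% Squared 2-Wasserstein distance between one-dimensional laws
W_2^2(\mu_a, \mu_b) = \int_0^1 \bigl( Q_a(u) - Q_b(u) \bigr)^2 \, du

% The Wasserstein barycenter of the groups has the averaged quantile function
\bar{Q}(u) = \sum_a p_a \, Q_a(u)

% Wasserstein-Frechet variance: weighted squared distance to the barycenter
V = \sum_a p_a \, W_2^2(\mu_a, \bar{\mu})
  = \sum_a p_a \int_0^1 \bigl( Q_a(u) - \bar{Q}(u) \bigr)^2 \, du

% Approximation from m quantiles per group
V \approx \sum_a p_a \, \frac{1}{m} \sum_{j=1}^{m}
  \bigl( Q_a(u_j) - \bar{Q}(u_j) \bigr)^2
```

The last line replaces the integral with a midpoint rule over the $m$ shared levels, so the error it introduces depends on how many quantiles each silo reports.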