[2603.05149] Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding
Computer Science > Machine Learning
arXiv:2603.05149 (cs)
[Submitted on 5 Mar 2026]

Title: Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding
Authors: Maximilian Hahn, Alina Zajak, Dominik Heider, Adèle Helena Ribeiro

Abstract: Causal discovery across multiple datasets is often constrained by data privacy regulations and cross-site heterogeneity, limiting the use of conventional methods that require a single, centralized dataset. To address these challenges, we introduce fedCI, a federated conditional independence test that rigorously handles heterogeneous datasets with non-identical sets of variables, site-specific effects, and mixed variable types, including continuous, ordinal, binary, and categorical variables. At its core, fedCI uses a federated Iteratively Reweighted Least Squares (IRLS) procedure to estimate the parameters of generalized linear models underlying likelihood-ratio tests for conditional independence. Building on this, we develop fedCI-IOD, a federated extension of the Integration of Overlapping Datasets (IOD) algorithm that replaces its meta-analysis strategy and enables, for the first time, federated causal discovery under latent confounding across distributed and heterogeneous datasets. By aggregating evidence federatively, fedCI-IOD not only preserves privacy but...
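
The federated IRLS idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' fedCI implementation: it assumes a binary outcome with a logistic link, identical variable sets at every site, and no site-specific effects, all of which are simplifications relative to the setting the abstract describes. The function names (`local_stats`, `federated_fit`, `federated_lrt`) and the simulated data are hypothetical. Each site shares only aggregate statistics (score vector, Fisher information, log-likelihood), and a coordinator sums them, takes a Newton/IRLS step, and finally forms a likelihood-ratio test of Y independent of X given Z.

```python
import math
import random

def local_stats(X, y, beta):
    """Runs at one site: returns score, Fisher information, and log-likelihood."""
    p = len(beta)
    score = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    loglik = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))
        w = mu * (1.0 - mu)  # IRLS weight for the logistic link
        loglik += yi * eta - math.log1p(math.exp(eta))
        for j in range(p):
            score[j] += xi[j] * (yi - mu)
            for k in range(p):
                info[j][k] += w * xi[j] * xi[k]
    return score, info, loglik

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def federated_fit(sites, p, max_iter=25, tol=1e-10):
    """Coordinator: sum per-site statistics and take Newton/IRLS steps."""
    beta = [0.0] * p
    for _ in range(max_iter):
        parts = [local_stats(X, y, beta) for X, y in sites]
        score = [sum(s[0][j] for s in parts) for j in range(p)]
        info = [[sum(s[1][j][k] for s in parts) for k in range(p)] for j in range(p)]
        step = solve(info, score)
        beta = [b + d for b, d in zip(beta, step)]
        if max(abs(d) for d in step) < tol:
            break
    loglik = sum(local_stats(X, y, beta)[2] for X, y in sites)
    return beta, loglik

def federated_lrt(sites, tested_col):
    """Likelihood-ratio test (1 degree of freedom) for dropping one column."""
    p = len(sites[0][0][0])
    _, ll_full = federated_fit(sites, p)
    reduced = [([[v for j, v in enumerate(row) if j != tested_col] for row in X], y)
               for X, y in sites]
    _, ll_null = federated_fit(reduced, p - 1)
    stat = max(0.0, 2.0 * (ll_full - ll_null))
    # Chi-square survival function with 1 df: P(chi2 > s) = erfc(sqrt(s / 2))
    return stat, math.erfc(math.sqrt(stat / 2.0))

def make_site(rng, n):
    """Simulated site (assumption): Y depends on Z only; X is correlated with Z."""
    X, y = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x = 0.5 * z + rng.gauss(0.0, 1.0)
        prob = 1.0 / (1.0 + math.exp(-(0.8 * z - 0.2)))
        X.append([1.0, x, z])  # columns: intercept, X, Z
        y.append(1.0 if rng.random() < prob else 0.0)
    return X, y

rng = random.Random(0)
sites = [make_site(rng, 150) for _ in range(3)]
stat, p_value = federated_lrt(sites, tested_col=1)  # test Y independent of X given Z
print(f"LR statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Because the summed per-site statistics equal the statistics of the pooled data, this sketch reproduces the centralized fit up to floating-point error without any site exchanging individual records. Handling mixed variable types, non-identical variable sets, and site-specific effects, as the abstract describes for fedCI, would require further machinery not shown here.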