[2509.25800] Characterization and Learning of Causal Graphs with Latent Confounders and Post-treatment Selection from Interventional Data
Summary
This paper presents a causal formulation and a discovery algorithm, F-FCI, for interventional data that account for both latent confounders and post-treatment selection, in which samples are selectively included in a dataset after interventions.
Why It Matters
Causal discovery underpins research in fields such as biology and the social sciences. Post-treatment selection is a common but often overlooked pitfall: in gene expression analysis, for example, both observational and interventional samples are retained only if they pass quality control. Ignoring this selection can introduce spurious dependencies and distributional changes under interventions that mimic causal responses and distort discovery results. The paper responds with a causal formulation that models post-treatment selection and an algorithm for learning causal structure under it.
Key Takeaways
- Introduces a novel causal formulation that models post-treatment selection.
- Highlights that latent confounders and post-treatment selection can both induce spurious dependencies in causal discovery.
- Presents the FI-Markov equivalence class for improved causal structure identification.
- Develops the F-FCI algorithm for identifying causal relations and confounders.
- Demonstrates effectiveness through experiments on synthetic and real-world datasets.
Computer Science > Machine Learning
arXiv:2509.25800 (cs)
[Submitted on 30 Sep 2025 (v1), last revised 25 Feb 2026 (this version, v2)]
Title: Characterization and Learning of Causal Graphs with Latent Confounders and Post-treatment Selection from Interventional Data
Authors: Gongxu Luo, Loka Li, Guangyi Chen, Haoyue Dai, Kun Zhang
Abstract: Interventional causal discovery seeks to identify causal relations by leveraging distributional changes introduced by interventions, even in the presence of latent confounders. Beyond the spurious dependencies induced by latent confounders, we highlight a common yet often overlooked challenge in the problem due to post-treatment selection, in which samples are selectively included in datasets after interventions. This fundamental challenge widely exists in biological studies; for example, in gene expression analysis, both observational and interventional samples are retained only if they meet quality control criteria (e.g., highly active cells). Neglecting post-treatment selection may introduce spurious dependencies and distributional changes under interventions, which can mimic causal responses, thereby distorting causal discovery results and challenging existing causal formulations. To address this, we introduce a novel c...
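The abstract's claim that post-treatment selection can mimic a causal response can be illustrated with a small simulation. The sketch below is not the paper's formulation or the F-FCI algorithm; it is a toy setup under stated assumptions: two variables X and Y with no causal link, an intervention modeled as a mean shift on X, and a quality-control filter that keeps a sample only when X + Y exceeds a threshold. The names `simulate`, `pearson`, `mean_y`, `x_shift`, and `threshold` are hypothetical and chosen for illustration.

```python
import random
from statistics import fmean


def simulate(n, x_shift, rng, threshold=1.0):
    """Draw n samples of (X, Y) and keep those passing a selection filter.

    Assumptions (illustrative only): X and Y are independent standard
    normals, x_shift != 0 models an intervention on X, and the filter
    X + Y > threshold stands in for post-treatment quality control.
    """
    kept = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0) + x_shift  # intervention shifts X only
        y = rng.gauss(0.0, 1.0)            # Y never depends on X
        if x + y > threshold:              # selection applied after treatment
            kept.append((x, y))
    return kept


def pearson(pairs):
    """Pearson correlation between the two coordinates of the pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def mean_y(pairs):
    """Mean of Y over the retained samples."""
    return fmean(y for _, y in pairs)


rng = random.Random(0)
obs = simulate(20000, 0.0, rng)  # observational regime, after selection
itv = simulate(20000, 2.0, rng)  # interventional regime, after selection

print("corr(X, Y) among selected observational samples:", round(pearson(obs), 3))
print("mean of Y (observational, interventional):",
      round(mean_y(obs), 3), round(mean_y(itv), 3))
```

In this toy setup the selected samples show a clear negative correlation between X and Y, and the mean of Y differs between the two regimes, even though X has no effect on Y. Passing `threshold=float("-inf")` disables the filter, and both effects disappear up to sampling noise.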