[2601.17160] Information-Theoretic Causal Bounds under Unmeasured Confounding
Summary
This paper presents a novel information-theoretic framework for sharp partial identification of causal effects in the presence of unmeasured confounding, overcoming four limitations of existing methods.
Why It Matters
Understanding causal relationships is crucial in fields such as economics and healthcare. This framework yields informative causal bounds from observational data alone, which can support better decision-making and policy formulation without restrictive assumptions or external inputs.
Key Takeaways
- Introduces a data-driven method for partial identification of causal effects without external inputs (instruments, proxies, or sensitivity parameters).
- Establishes sharp bounds on causal effects using f-divergences and propensity scores (see the sketch after this list).
- Demonstrates applicability through simulation studies and real-world data.
- Provides a semiparametric estimator with consistency guarantees for inference.
- Addresses limitations of existing causal inference methods.
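A minimal sketch of how a divergence bound of this kind can be turned into an interval for a conditional causal effect. Everything here is illustrative and not the paper's actual estimator: the logistic propensity model, the `kl_budget` function standing in for the paper's bound, the Pinsker-style conversion from KL to total variation, and the outcome bounded in [0, 1] (the paper avoids the bounded-outcome restriction; we use one only to keep the worked example simple).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy observational data: covariate X, binary treatment A, outcome Y in [0, 1].
n = 5000
X = rng.normal(size=(n, 1))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = np.clip(0.3 + 0.4 * A + 0.2 * X[:, 0] + 0.1 * rng.normal(size=n), 0, 1)

# Step 1: estimate the propensity score pi(a | x) from the data.
prop_model = LogisticRegression().fit(X, A)

def kl_budget(pi):
    """Hypothetical divergence budget as a function of the propensity score.
    Stands in for the paper's bound function; weaker overlap -> looser bound."""
    return -np.log(pi)

def effect_interval(x, a):
    """Interval for E[Y | do(A=a), X=x] for a bounded outcome in [0, 1].

    Uses: (i) an assumed KL bound eps between the observational and
    interventional conditional laws of Y, (ii) Pinsker's inequality
    TV <= sqrt(eps / 2), and (iii) |E_obs[Y] - E_int[Y]| <= TV for Y in [0, 1].
    """
    pi = prop_model.predict_proba(np.atleast_2d(x))[0, a]
    eps = kl_budget(pi)
    tv = np.sqrt(eps / 2.0)
    mask = A == a
    # Crude local average as a stand-in for E[Y | A=a, X ~ x].
    mu_obs = Y[mask][np.abs(X[mask, 0] - x) < 0.25].mean()
    return max(0.0, mu_obs - tv), min(1.0, mu_obs + tv)

print(effect_interval(x=0.0, a=1))
```

The printed pair brackets the interventional conditional mean under these assumptions; the interval widens exactly where the propensity score signals poor overlap, which is the qualitative behavior the paper's propensity-driven bound describes.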
Statistics > Machine Learning
arXiv:2601.17160 (stat)
[Submitted on 23 Jan 2026 (v1), last revised 20 Feb 2026 (this version, v3)]
Title: Information-Theoretic Causal Bounds under Unmeasured Confounding
Authors: Yonghan Jung, Bogyeong Kang
Abstract: We develop a data-driven information-theoretic framework for sharp partial identification of causal effects under unmeasured confounding. Existing approaches often rely on restrictive assumptions, such as bounded or discrete outcomes; require external inputs (for example, instrumental variables, proxies, or user-specified sensitivity parameters); necessitate full structural causal model specifications; or focus solely on population-level averages while neglecting covariate-conditional effects. We overcome all four limitations simultaneously by establishing novel information-theoretic, data-driven divergence bounds. Our key theoretical contribution shows that the f-divergence between the observational distribution P(Y | A = a, X = x) and the interventional distribution P(Y | do(A = a), X = x) is upper bounded by a function of the propensity score alone. This result enables sharp partial identification of conditional causal effects directly from observational data, without requiring external sensitivity parameters, auxiliary variables, full structural specifications, or out...
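To make the abstract's central claim concrete in notation: the bound below matches the statement above, but the choice of f-divergence and the bound function Φ are placeholders, since the abstract does not give their exact form.

```latex
% Sketch of the stated result; \Phi is a placeholder, not the paper's
% exact expression. The f-divergence between the observational and
% interventional conditional outcome laws is controlled by the
% propensity score \pi(a \mid x) = P(A = a \mid X = x) alone:
\[
  D_f\bigl( P(Y \mid A = a, X = x) \,\big\|\, P(Y \mid \mathrm{do}(A = a), X = x) \bigr)
  \;\le\; \Phi\bigl( \pi(a \mid x) \bigr).
\]
```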