[2602.16481] Leveraging Large Language Models for Causal Discovery: a Constraint-based, Argumentation-driven Approach
Summary
This article explores the use of large language models (LLMs) in causal discovery, proposing a constraint-based, argumentation-driven approach that integrates expert knowledge and data to improve causal graph construction.
Why It Matters
Causal discovery is crucial for understanding relationships in data, impacting fields from healthcare to economics. This research leverages LLMs, which can enhance the process by providing semantic insights, thus bridging the gap between data-driven and expert-driven methodologies.
Key Takeaways
- Introduces a novel approach combining LLMs with causal discovery techniques.
- Demonstrates state-of-the-art performance on standard benchmarks.
- Proposes an evaluation protocol to address memorization bias in LLMs.
Computer Science > Artificial Intelligence
arXiv:2602.16481 (cs) · Submitted on 18 Feb 2026
Authors: Zihao Li, Fabrizio Russo

Abstract
Causal discovery seeks to uncover causal relations from data, typically represented as causal graphs, and is essential for predicting the effects of interventions. While expert knowledge is required to construct principled causal graphs, many statistical methods have been proposed to leverage observational data with varying formal guarantees. Causal Assumption-based Argumentation (ABA) is a framework that uses symbolic reasoning to ensure correspondence between input constraints and output graphs, while offering a principled way to combine data and expertise. We explore the use of large language models (LLMs) as imperfect experts for Causal ABA, eliciting semantic structural priors from variable names and descriptions and integrating them with conditional-independence evidence. Experiments on standard benchmarks and semantically grounded synthetic graphs demonstrate state-of-the-art performance, and we additionally introduce an evaluation protocol to mitigate memorisation bias when assessing LLMs for causal discovery.
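To make the abstract's idea concrete, here is a minimal, hypothetical sketch of combining data-driven independence evidence with an LLM-elicited semantic prior. It is not the paper's Causal ABA formulation: the prior values, the correlation threshold, and the combination rule are all illustrative assumptions.

```python
import math
import random

def pearson_corr(x, y):
    # Plain Pearson correlation, used here as a crude (in)dependence test.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(0)
# Synthetic data: Y depends on X; Z is independent of both.
X = [random.gauss(0, 1) for _ in range(500)]
Y = [2 * x + random.gauss(0, 1) for x in X]
Z = [random.gauss(0, 1) for _ in range(500)]

# Data evidence: |correlation| above a threshold suggests dependence.
dep_xy = abs(pearson_corr(X, Y)) > 0.1
dep_xz = abs(pearson_corr(X, Z)) > 0.1

# Hypothetical LLM priors in [0, 1]: semantic plausibility of an edge,
# imagined as elicited from variable names/descriptions (values assumed).
prior = {("X", "Y"): 0.9, ("X", "Z"): 0.2}

def accept_edge(pair, dependent, threshold=0.5):
    # Toy combination rule (an assumption, not Causal ABA's symbolic
    # reasoning): keep an edge only if the data shows dependence AND
    # the semantic prior agrees.
    return dependent and prior[pair] >= threshold

print(accept_edge(("X", "Y"), dep_xy))
print(accept_edge(("X", "Z"), dep_xz))
```

In the paper's actual framework, such constraints would instead be encoded as assumptions and rules in an argumentation system, which resolves conflicts between noisy statistical evidence and the imperfect LLM "expert" in a principled, symbolic way.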