[2602.14972] Use What You Know: Causal Foundation Models with Partial Graphs
Summary
This paper introduces methods for conditioning Causal Foundation Models (CFMs) on causal information, ranging from the full causal graph to partial or ancestral information, so that available domain knowledge can improve their predictions.
Why It Matters
Letting CFMs use domain knowledge addresses a significant limitation of current models, which cannot incorporate it at all. This matters most in applications where the complete causal graph is unavailable and only partial causal information can be supplied.
Key Takeaways
- In their current state, CFMs do not allow domain knowledge to be incorporated, which can lead to suboptimal predictions.
- The proposed method allows CFMs to utilize partial causal information effectively.
- Among the conditioning strategies evaluated, injecting learnable biases into the attention mechanism is the most effective way to use full and partial causal information.
- With this conditioning, a general-purpose CFM matches the performance of specialized models trained on specific causal structures.
- This advancement is crucial for developing all-in-one causal models capable of answering complex queries.
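The attention-bias idea in the takeaways above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `attention_with_graph_bias`, the three-way edge encoding (present, absent, unknown), and the use of one scalar bias per edge state are all assumptions made for illustration. In a trained model the entries of `bias_table` would be learned parameters; here they are set by hand.

```python
import numpy as np

# Hypothetical encoding of partial graph knowledge for a pair (i, j):
#   1 = relation known to be present, 0 = known to be absent, -1 = unknown
def attention_with_graph_bias(q, k, v, adj, bias_table):
    """Single-head attention with an additive bias chosen by graph knowledge.

    q, k, v:     arrays of shape (n_vars, d), one row per variable
    adj:         int array of shape (n_vars, n_vars), entries in {-1, 0, 1}
    bias_table:  array of shape (3,), biases for [unknown, absent, present]
                 (assumed to be learnable parameters in a real model)
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)            # standard scaled dot-product scores
    scores = scores + bias_table[adj + 1]    # map {-1, 0, 1} -> indices {0, 1, 2}
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Usage: three variables, with some pairwise relations unknown (-1).
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(3, 4)) for _ in range(3))
adj = np.array([[-1, 1, 0],
                [0, -1, -1],
                [1, 0, -1]])
bias_table = np.array([0.0, -1e9, 2.0])  # unknown: neutral, absent: masked, present: boosted
out, weights = attention_with_graph_bias(q, k, v, adj, bias_table)
```

With a zero bias for unknown pairs, the model falls back to ordinary attention wherever no causal information is given, which is one way partial knowledge could be handled under these assumptions.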
Computer Science > Machine Learning
arXiv:2602.14972 (cs)
[Submitted on 16 Feb 2026]
Title: Use What You Know: Causal Foundation Models with Partial Graphs
Authors: Arik Reuter, Anish Dhir, Cristiana Diaconu, Jake Robertson, Ole Ossen, Frank Hutter, Adrian Weller, Mark van der Wilk, Bernhard Schölkopf
Abstract: Estimating causal quantities traditionally relies on bespoke estimators tailored to specific assumptions. Recently proposed Causal Foundation Models (CFMs) promise a more unified approach by amortising causal discovery and inference in a single step. However, in their current state, they do not allow for the incorporation of any domain knowledge, which can lead to suboptimal predictions. We bridge this gap by introducing methods to condition CFMs on causal information, such as the causal graph or more readily available ancestral information. When access to complete causal graph information is too strict a requirement, our approach also effectively leverages partial causal information. We systematically evaluate conditioning strategies and find that injecting learnable biases into the attention mechanism is the most effective method to utilise full and partial causal information. Our experiments show that this conditioning allows a general-purpose CFM to match the performance of specialised models trained on spec...