[2505.16308] Beyond All-to-All: Causal-Aligned Transformer with Dynamic Structure Learning for Multivariate Time Series Forecasting
Summary
This paper proposes the Causal Decomposition Transformer (CDT) for multivariate time series forecasting. CDT forecasts each target variable separately and refines an inferred causal structure during training, which the authors present as a way to overcome the limitations of all-to-all models and improve prediction accuracy.
Why It Matters
Multivariate time series are increasingly common, and accurate forecasts depend on knowing which variables actually influence a given target. This research separates the history into distinct types of causal influence rather than treating all variables alike, which may support better decisions in fields such as finance, healthcare, and climate science.
Key Takeaways
- The proposed all-to-one forecasting paradigm improves accuracy by predicting each target variable separately.
- The Causal Decomposition Transformer (CDT) integrates a dynamic causal adapter that refines the causal structure during training.
- The research addresses issues of spurious correlations and collider bias, enhancing model robustness.
- Extensive experiments validate the CDT's effectiveness across multiple benchmark datasets.
- Understanding causal influences can significantly improve forecasting in various applications.
Computer Science > Machine Learning
arXiv:2505.16308 (cs)
Submitted on 22 May 2025 (v1), last revised 13 Feb 2026 (this version, v2)
Authors: Xingyu Zhang, Hanyun Du, Zeen Song, Siyu Zhao, Changwen Zheng, Wenwen Qiang
Abstract
Most existing multivariate time series forecasting methods adopt an all-to-all paradigm that feeds all variable histories into a unified model to predict their future values without distinguishing their individual roles. However, this undifferentiated paradigm makes it difficult to identify variable-specific causal influences and often entangles causally relevant information with spurious correlations. To address this limitation, we propose an all-to-one forecasting paradigm that predicts each target variable separately. Specifically, we first construct a Structural Causal Model from observational data and then, for each target variable, we partition the historical sequence into four subsegments according to the inferred causal structure: endogenous, direct causal, collider causal, and spurious correlation. Furthermore, we propose the Causal Decomposition Transformer (CDT), which integrates a dynamic causal adapter...
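The per-target partition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The function name, the graph encoding, the example variables, and above all the assignment rules are assumptions made for illustration: "direct causal" is taken to mean direct causes (parents) of the target, "collider causal" to mean variables that share a common effect with the target, and every remaining variable is placed in the "spurious correlation" group. The paper's actual definitions may differ.

```python
from typing import Dict, List, Set


def partition_history_variables(
    target: str,
    variables: List[str],
    effects: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Split variables into four groups relative to one target (hypothetical rules).

    `effects` maps a variable to the set of variables it directly causes.
    """
    target_effects = effects.get(target, set())
    others = [v for v in variables if v != target]

    # Assumption: direct causal = variables with an edge into the target.
    direct = [v for v in others if target in effects.get(v, set())]

    # Assumption: collider causal = non-parents sharing a direct effect with the target.
    collider = [
        v for v in others
        if v not in direct and effects.get(v, set()) & target_effects
    ]

    # Simplification: everything else, including the target's own effects,
    # lands in the last group.
    spurious = [v for v in others if v not in direct and v not in collider]

    return {
        "endogenous": [target],
        "direct_causal": direct,
        "collider_causal": collider,
        "spurious_correlation": spurious,
    }


# Toy causal graph (made up): temp -> load -> price <- wind
variables = ["load", "temp", "wind", "price", "humidity"]
effects = {"temp": {"load"}, "load": {"price"}, "wind": {"price"}}

# All-to-one: one partition per target instead of one shared input for all targets.
partitions = {t: partition_history_variables(t, variables, effects) for t in variables}
print(partitions["load"])
```

Running the partition once per target mirrors the all-to-one idea: each target gets its own view of the history rather than a single undifferentiated input.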