[2602.14506] Covariance-Aware Transformers for Quadratic Programming and Decision Making
Summary
This paper studies how transformers can solve quadratic programming (QP) problems and uses that result to motivate Time2Decide, a method that improves decision making by explicitly feeding the covariance matrix between variates to a time series foundation model.
Why It Matters
Many decision-making problems in finance and optimization reduce to quadratic programs built on covariance matrices; the classical portfolio optimization problem, for example, admits an $\ell_1$-constrained QP formulation. Showing that transformers can solve such programs motivates giving the model direct access to the covariance matrix, which the paper reports improves portfolio optimization performance.
Key Takeaways
- Linear attention can provably solve unconstrained quadratic programs by emulating gradient descent iterations.
- The proposed Time2Decide method enhances a time series foundation model (TSFM) by explicitly feeding it the covariance matrix between the variates.
- Empirically, Time2Decide outperforms both the base TSFM and the classical Predict-then-Optimize (PtO) procedure on portfolio optimization.
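To make the first takeaway concrete, the following is a minimal sketch of plain gradient descent on an unconstrained QP with objective $\frac{1}{2}x^\top Ax+b^\top x$, whose gradient is $Ax+b$. It is not the paper's attention construction; it only shows the iteration that, per the abstract, linear attention emulates. The function name `qp_gradient_descent`, the step size, the iteration count, and the example matrix are all illustrative assumptions, and $A$ is assumed symmetric positive definite.

```python
def qp_gradient_descent(A, b, step_size, num_steps):
    """Minimize 0.5 * x^T A x + b^T x by plain gradient descent.

    A is a list of rows (assumed symmetric positive definite), b is a list.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    x = [0.0] * len(b)
    for _ in range(num_steps):
        # The gradient is A x + b; compute it one row of A at a time.
        grad = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(A, b)]
        # Move against the gradient.
        x = [x_i - step_size * g_i for x_i, g_i in zip(x, grad)]
    return x


# Small example: the minimizer solves A x = -b, which here is x = (1, 1).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [-3.0, -4.0]
x_hat = qp_gradient_descent(A, b, step_size=0.2, num_steps=200)
```

The gradient is computed row by row on purpose, loosely mirroring the row-by-row tokenization of $A$ described in the abstract. The step size must stay below $2/\lambda_{\max}(A)$ for the iteration to converge.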
Computer Science > Machine Learning
arXiv:2602.14506 (cs)
[Submitted on 16 Feb 2026]
Title: Covariance-Aware Transformers for Quadratic Programming and Decision Making
Authors: Kutay Tire, Yufan Zhang, Ege Onur Taga, Samet Oymak
Abstract: We explore the use of transformers for solving quadratic programs and how this capability benefits decision-making problems that involve covariance matrices. We first show that the linear attention mechanism can provably solve unconstrained QPs by tokenizing the matrix variables (e.g., $A$ of the objective $\frac{1}{2}x^\top Ax+b^\top x$) row-by-row and emulating gradient descent iterations. Furthermore, by incorporating MLPs, a transformer block can solve (i) $\ell_1$-penalized QPs by emulating iterative soft-thresholding and (ii) $\ell_1$-constrained QPs when equipped with an additional feedback loop. Our theory motivates us to introduce Time2Decide: a generic method that enhances a time series foundation model (TSFM) by explicitly feeding the covariance matrix between the variates. We empirically find that Time2Decide uniformly outperforms the base TSFM model for the classical portfolio optimization problem that admits an $\ell_1$-constrained QP formulation. Remarkably, Time2Decide also outperforms the classical "Predict-then-Optimize (PtO)" procedure, where we first fore...
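
The iterative soft-thresholding step named in the abstract can be sketched as follows for the $\ell_1$-penalized objective $\frac{1}{2}x^\top Ax+b^\top x+\lambda\|x\|_1$. This is the textbook ISTA update (a gradient step followed by soft-thresholding), written in plain Python; it is not the paper's transformer construction. The penalty weight `lam`, the step size, and the function names are assumptions introduced for illustration.

```python
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t: sign(z) * max(|z| - t, 0)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def ista_qp(A, b, lam, step_size, num_steps):
    """Minimize 0.5 * x^T A x + b^T x + lam * ||x||_1 with ISTA.

    Hypothetical helper for illustration; A is a list of rows
    (assumed symmetric positive definite), b is a list.
    """
    x = [0.0] * len(b)
    for _ in range(num_steps):
        # Gradient of the smooth part: A x + b.
        grad = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(A, b)]
        # Gradient step, then soft-threshold with level step_size * lam.
        x = [soft_threshold(x_i - step_size * g_i, step_size * lam)
             for x_i, g_i in zip(x, grad)]
    return x


# With A = I the minimizer is soft_threshold(-b, lam) per coordinate: (2, 0).
A = [[1.0, 0.0], [0.0, 1.0]]
b = [-3.0, 0.5]
x_sparse = ista_qp(A, b, lam=1.0, step_size=0.5, num_steps=100)
```

The second coordinate is driven exactly to zero, which is the sparsity effect the $\ell_1$ penalty is meant to produce.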