[2602.14506] Covariance-Aware Transformers for Quadratic Programming and Decision Making

arXiv - Machine Learning · 4 min read

Summary

This paper introduces Covariance-Aware Transformers, showing that transformers can provably solve quadratic programming (QP) problems and that explicitly feeding covariance matrices to a time series foundation model improves downstream decision making.

Why It Matters

Understanding how transformers can be adapted for quadratic programming is important for machine learning applications in decision making, particularly in finance and optimization. The research points to concrete gains in portfolio optimization when a forecasting model is made aware of the covariance structure among its variates.

Key Takeaways

  • Transformers with a linear attention mechanism can provably solve unconstrained quadratic programs by emulating gradient descent iterations; adding MLPs extends this to $\ell_1$-penalized and $\ell_1$-constrained QPs (a minimal sketch follows this list).
  • The proposed Time2Decide method enhances a time series foundation model (TSFM) by explicitly feeding it the covariance matrix between the variates.
  • Empirically, Time2Decide outperforms both the base TSFM and the classical Predict-then-Optimize (PtO) procedure on portfolio optimization.
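
The first takeaway refers to emulating gradient descent on the quadratic objective $\frac{1}{2}x^\top Ax+b^\top x$. The following is a minimal NumPy sketch of those reference iterations only; the function name, step-size rule, and example matrices are illustrative assumptions, not the paper's attention construction.

```python
import numpy as np

def solve_unconstrained_qp(A, b, num_iters=500, step=None):
    """Gradient descent on f(x) = 0.5 * x^T A x + b^T x (A symmetric PSD assumed)."""
    if step is None:
        # Step size 1 / lambda_max(A) keeps the iterations stable (assumption).
        step = 1.0 / np.linalg.eigvalsh(A).max()
    x = np.zeros(A.shape[0])
    for _ in range(num_iters):
        grad = A @ x + b      # gradient of the quadratic objective
        x = x - step * grad   # one gradient-descent step
    return x

# Sanity check against the closed-form minimizer x* = -A^{-1} b
A = np.array([[3.0, 0.5], [0.5, 2.0]])
b = np.array([1.0, -1.0])
print(np.allclose(solve_unconstrained_qp(A, b), -np.linalg.solve(A, b), atol=1e-6))
```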

Computer Science > Machine Learning
arXiv:2602.14506 (cs) [Submitted on 16 Feb 2026]
Title: Covariance-Aware Transformers for Quadratic Programming and Decision Making
Authors: Kutay Tire, Yufan Zhang, Ege Onur Taga, Samet Oymak

Abstract: We explore the use of transformers for solving quadratic programs and how this capability benefits decision-making problems that involve covariance matrices. We first show that the linear attention mechanism can provably solve unconstrained QPs by tokenizing the matrix variables (e.g. $A$ of the objective $\frac{1}{2}x^\top Ax+b^\top x$) row-by-row and emulating gradient descent iterations. Furthermore, by incorporating MLPs, a transformer block can solve (i) $\ell_1$-penalized QPs by emulating iterative soft-thresholding and (ii) $\ell_1$-constrained QPs when equipped with an additional feedback loop. Our theory motivates us to introduce Time2Decide: a generic method that enhances a time series foundation model (TSFM) by explicitly feeding the covariance matrix between the variates. We empirically find that Time2Decide uniformly outperforms the base TSFM model for the classical portfolio optimization problem that admits an $\ell_1$-constrained QP formulation. Remarkably, Time2Decide also outperforms the classical "Predict-then-Optimize (PtO)" procedure, where we first fore...
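
The abstract also states that, with MLPs, a transformer block can emulate iterative soft-thresholding for $\ell_1$-penalized QPs. Below is a minimal NumPy sketch of those ISTA iterations on $\frac{1}{2}x^\top Ax+b^\top x+\lambda\|x\|_1$; again, this is only the reference computation under assumed names and step size, not the transformer construction from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||x||_1: elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def solve_l1_penalized_qp(A, b, lam, num_iters=1000, step=None):
    """ISTA on f(x) = 0.5 * x^T A x + b^T x + lam * ||x||_1 (A symmetric PSD assumed)."""
    if step is None:
        step = 1.0 / np.linalg.eigvalsh(A).max()  # stable step size (assumption)
    x = np.zeros(A.shape[0])
    for _ in range(num_iters):
        grad = A @ x + b                                  # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)   # gradient step + shrinkage
    return x

A = np.array([[3.0, 0.5], [0.5, 2.0]])
b = np.array([1.0, -1.0])
print(solve_l1_penalized_qp(A, b, lam=0.1))
```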

Related Articles

Machine Learning

Anthropic’s Mythos Will Force a Cybersecurity Reckoning—Just Not the One You Think | WIRED

The new AI model is being heralded—and feared—as a hacker’s superweapon. Experts say its arrival is a wake-up call for developers who hav...

Wired - AI · 9 min
Machine Learning

Is google deepmind known to ghost applicants? [D]

Hey sub, I'm sorry if this is a wrong place to ask but I don't see a sub for ML roles separately. I was wondering if deepmind is known to...

Reddit - Machine Learning · 1 min
LLMs

OpenAI & Anthropic’s CEOs Wouldn't Hold Hands, but Their Models Fell in Love In An LLM Dating Show

People ask AI relationship questions all the time, from "Does this person like me?" to "Should I text back?" But have you ever thought ab...

Reddit - Artificial Intelligence · 1 min
LLMs

A 135M model achieves coherent output on a laptop CPU. Scaling is σ compensation, not intelligence.

SmolLM2 135M. Lenovo T14 CPU. No GPU. No RLHF. No BPE. Coherent, non-sycophantic, contextually appropriate output. First message. No prio...

Reddit - Artificial Intelligence · 1 min