[2208.02389] Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing
Computer Science > Machine Learning
arXiv:2208.02389 (cs)
[Submitted on 4 Aug 2022 (v1), last revised 2 Apr 2026 (this version, v3)]
Title: Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing
Authors: Jingwei Ji, Renyuan Xu, Ruihao Zhu

Abstract: Motivated by practical considerations in machine learning for financial decision-making, such as risk aversion and large action space, we consider risk-aware bandits optimization with applications in smart order routing (SOR). Specifically, based on preliminary observations of linear price impacts made from the NASDAQ ITCH dataset, we initiate the study of risk-aware linear bandits. In this setting, we aim at minimizing regret, which measures our performance deficit compared to the optimum's, under the mean-variance metric when facing a set of actions whose rewards are linear functions of (initially) unknown parameters. Driven by the variance-minimizing globally-optimal (G-optimal) design, we propose the novel instance-independent Risk-Aware Explore-then-Commit (RISE) algorithm and the instance-dependent Risk-Aware Successive Elimination (RISE++) algorithm. Then, we rigorously analyze their near-optimal regret upper bounds to show that, by leveraging the linear structure, our algorithms can dramatically reduce the regret when compared to existing ...
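
The abstract names the G-optimal design as the driver of both algorithms but does not say how such a design is computed. As a rough illustration only, the sketch below approximates a G-optimal design over a finite action set with the standard Frank-Wolfe iteration, stopping by the Kiefer-Wolfowitz criterion. This is a generic textbook procedure and not the authors' RISE or RISE++ algorithm. The function name `g_optimal_design`, its parameters, and the assumption that the actions form a finite set spanning R^d are all hypothetical choices made for the example.

```python
import numpy as np


def g_optimal_design(actions, n_iters=5000, tol=1e-2):
    """Approximate a G-optimal design over a finite action set (Frank-Wolfe).

    actions: (K, d) array of action vectors; assumed to span R^d.
    Returns a length-K probability vector pi over the actions.
    """
    K, d = actions.shape
    pi = np.full(K, 1.0 / K)  # start from the uniform design
    for _ in range(n_iters):
        # Design matrix V(pi) = sum_a pi(a) * a a^T
        V = actions.T @ (pi[:, None] * actions)
        V_inv = np.linalg.inv(V)
        # g(a) = a^T V(pi)^{-1} a, the scaled prediction variance at action a
        g = np.einsum("kd,de,ke->k", actions, V_inv, actions)
        k = int(np.argmax(g))
        # Kiefer-Wolfowitz: the optimal design satisfies max_a g(a) = d
        if g[k] <= d * (1.0 + tol):
            break
        # Closed-form line-search step toward the worst-covered action
        gamma = (g[k] / d - 1.0) / (g[k] - 1.0)
        pi = (1.0 - gamma) * pi
        pi[k] += gamma
    return pi


# Illustrative usage on synthetic actions (not data from the paper)
rng = np.random.default_rng(0)
example_actions = rng.normal(size=(20, 3))
example_design = g_optimal_design(example_actions)
```

Sampling actions in proportion to the returned weights keeps the worst-case prediction variance across all actions close to its minimum, which is the property an explore-then-commit scheme would rely on during its exploration phase.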