[2502.07397] Linear Bandits beyond Inner Product Spaces, the case of Bandit Optimal Transport

arXiv - Machine Learning · 4 min read · Article

Summary

This paper studies linear bandits beyond traditional inner product spaces, using Optimal Transport as the motivating example. It proposes a refinement of the classical OFUL algorithm that preserves efficient learning without relying on an inner product structure.

Why It Matters

Understanding linear bandits in the context of optimal transport expands the applicability of online learning algorithms. This research could enhance decision-making processes in various fields, including recommendation systems and adaptive clinical trials, by providing more flexible frameworks for learning.

Key Takeaways

  • The study challenges the reliance on inner product structures in linear bandits.
  • The proposed algorithm embeds the action set into a Hilbertian subspace, where confidence sets can be built via least-squares estimation.
  • It achieves regret bounds comparable to those of the classical OFUL algorithm (a minimal sketch of the classical optimistic step follows this list).
  • The research highlights the importance of optimal transport in online learning.
  • Results indicate that the algorithm's regret depends on the regularity of the cost function.
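The refinement builds on the classical OFUL recipe of least-squares estimation plus an ellipsoidal confidence set. As a point of reference, here is a minimal sketch of that classical optimistic step for a cost-minimisation linear bandit; the function name, the confidence radius beta, and the ridge weight lam are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def oful_step(actions, X, y, lam=1.0, beta=1.0):
    """One optimistic action choice in a classical linear bandit (OFUL-style).

    actions : (K, d) array of candidate action vectors.
    X, y    : past actions (t, d) and observed costs (t,).
    lam     : ridge-regularisation weight; beta : confidence radius.
    Names and default values here are illustrative, not the paper's.
    """
    d = actions.shape[1]
    V = lam * np.eye(d) + X.T @ X              # regularised design matrix
    theta_hat = np.linalg.solve(V, X.T @ y)    # least-squares estimate of the cost parameter
    V_inv = np.linalg.inv(V)
    # Optimism for a cost-minimisation bandit: for each action a, take the lowest
    # cost consistent with the ellipsoid {theta : ||theta - theta_hat||_V <= beta}.
    widths = np.sqrt(np.einsum("kd,de,ke->k", actions, V_inv, actions))
    optimistic_costs = actions @ theta_hat - beta * widths
    return actions[np.argmin(optimistic_costs)]

# Hypothetical usage: 10 candidate actions in R^4 after 50 observed rounds.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(50, 4)), rng.normal(size=50)
chosen = oful_step(rng.normal(size=(10, 4)), X, y)
```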

Statistics > Machine Learning · arXiv:2502.07397 (stat)

[Submitted on 11 Feb 2025 (v1), last revised 17 Feb 2026 (this version, v2)]

Title: Linear Bandits beyond Inner Product Spaces, the case of Bandit Optimal Transport

Authors: Lorenzo Croissant (CREST, FAIRPLAY, ENSAE Paris)

Abstract: Linear bandits have long been a central topic in online learning, with applications ranging from recommendation systems to adaptive clinical trials. Their general learnability has been established when the objective is to minimise the inner product between a cost parameter and the decision variable. While this is highly general, this reliance on an inner product structure belies the name of linear bandits and fails to account for problems such as Optimal Transport. Using the Kantorovich formulation of Optimal Transport as an example, we show that an inner product structure is not necessary to achieve efficient learning in linear bandits. We propose a refinement of the classical OFUL algorithm that operates by embedding the action set into a Hilbertian subspace, where confidence sets can be built via least-squares estimation. Actions are then constrained to this subspace by penalising optimism. The analysis is completed by leveraging convergence results from penalised (entropic) transport to the Kantorovich problem...
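The closing point of the abstract, that the analysis leverages convergence of penalised (entropic) transport to the Kantorovich problem, can be illustrated with a generic Sinkhorn iteration: as the entropic penalty shrinks, the penalised transport cost approaches the unregularised Kantorovich value. The sketch below uses assumed marginals and a random cost matrix and is not the paper's algorithm.

```python
import numpy as np

def entropic_ot_cost(C, mu, nu, eps, n_iter=2000):
    """Entropy-penalised transport cost <C, P_eps> between marginals mu and nu.

    Plain Sinkhorn iteration; eps is the entropic penalty weight.
    Illustrative only: this is the penalised problem whose convergence to the
    Kantorovich value the analysis relies on, not the paper's algorithm.
    """
    K = np.exp(-C / eps)                     # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):                  # alternating marginal projections
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    P = u[:, None] * K * v[None, :]          # entropic transport plan
    return float(np.sum(P * C))

# As eps shrinks, the penalised cost approaches the unregularised Kantorovich value.
rng = np.random.default_rng(0)
C = rng.random((5, 5))                       # assumed 5x5 cost matrix
mu = nu = np.full(5, 0.2)                    # uniform marginals
for eps in (1.0, 0.1, 0.01):
    print(f"eps={eps:>4}: cost={entropic_ot_cost(C, mu, nu, eps):.4f}")
```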

Related Articles

Llms · Alternative to NotebookLM with no data limits
NotebookLM is one of the best and most useful AI platforms out there, but once you start using it regularly you also feel its limitations...
Reddit - Artificial Intelligence · 1 min

Llms · AI: Anthropic's peek-a-boo of Claude Mythos, its next frontier model. AI-RTZ #1051
...with cybersecurity industry alliance Glasswing, all ahead of mega-AI IPOs
AI Tools & Products · 10 min

Machine Learning · Meta’s New AI Model Gives Mark Zuckerberg a Seat at the Big Kid’s Table
Muse Spark is Meta’s first model since its AI reboot, and the benchmarks suggest formidable performance.
Wired - AI · 6 min

Machine Learning · Meta debuts new AI model, attempting to catch Google, OpenAI after spending billions
AI Tools & Products · 6 min