[2510.09908] Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation


arXiv - Machine Learning 3 min read

About this article


Statistics > Machine Learning
arXiv:2510.09908 (stat)
[Submitted on 10 Oct 2025 (v1), last revised 1 Apr 2026 (this version, v3)]

Title: Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation
Authors: Hao Yan, Heyan Zhang, Yongyi Guo

Abstract: The rise of large-scale pretrained models has made it feasible to generate predictive or synthetic features at low cost, raising the question of how to incorporate such surrogate predictions into downstream decision-making. We study this problem in the setting of online linear contextual bandits, where contexts may be complex, nonstationary, and only partially observed. In addition to bandit data, we assume access to an auxiliary dataset containing fully observed contexts--common in practice since such data are collected without adaptive interventions. We propose PULSE-UCB, an algorithm that leverages pretrained models trained on the auxiliary data to impute missing features during online decision-making. We establish regret guarantees that decompose into a standard bandit term plus an additional component reflecting pretrained model quality. In the i.i.d. context case with Hölder-smooth missing features, PULSE-UCB achieves near-optimal performance, supported by matching lower bounds. Our results quantify how uncertainty in predi...
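The abstract's plug-in idea — impute the unobserved part of the context with a pretrained model, then run a standard linear-UCB algorithm on the completed context — can be sketched as follows. This is a minimal toy, not the paper's PULSE-UCB: the "pretrained imputer" here is just a fixed linear map standing in for a model trained on the auxiliary fully-observed dataset, and all dimensions, the exploration width `alpha`, and the reward model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 4, 3, 500                      # full context dim, arms, rounds
theta = rng.normal(size=(K, d))          # unknown per-arm reward parameters

# Stand-in "pretrained imputer": predicts the missing feature block from
# the observed one. (A real imputer would be fit offline on the auxiliary
# dataset of fully observed contexts; this fixed linear map is a toy.)
W = rng.normal(size=(2, 2))

def impute(x_obs):
    """Complete a partially observed context by plugging in predictions."""
    return np.concatenate([x_obs, W @ x_obs])

# LinUCB sufficient statistics per arm: A_k = I + sum x x^T, b_k = sum r x
A = np.stack([np.eye(d) for _ in range(K)])
b = np.zeros((K, d))
alpha = 1.0                              # exploration width (assumed)

total_reward = 0.0
for t in range(T):
    x_obs = rng.normal(size=2)           # only half the context is observed
    x = impute(x_obs)                    # act on the imputed full context
    # UCB score: theta_hat_k^T x + alpha * sqrt(x^T A_k^{-1} x)
    ucbs = []
    for k in range(K):
        Ainv = np.linalg.inv(A[k])
        theta_hat = Ainv @ b[k]
        ucbs.append(theta_hat @ x + alpha * np.sqrt(x @ Ainv @ x))
    k = int(np.argmax(ucbs))
    r = theta[k] @ x + 0.1 * rng.normal()  # noisy linear reward
    A[k] += np.outer(x, x)               # update chosen arm's statistics
    b[k] += r * x
    total_reward += r

print(np.isfinite(total_reward))
```

Consistent with the abstract's regret decomposition, any error in `impute` propagates into the bandit's reward estimates, so the achievable regret depends on both the usual bandit term and the imputation model's quality.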

Originally published on April 03, 2026. Curated by AI News.

