[2405.08253] Thompson Sampling for Infinite-Horizon Discounted Decision Processes
Statistics > Machine Learning
arXiv:2405.08253 (stat)
[Submitted on 14 May 2024 (v1), last revised 8 Apr 2026 (this version, v3)]

Title: Thompson Sampling for Infinite-Horizon Discounted Decision Processes
Authors: Daniel Adelman, Cagla Keceli, Alba V. Olivares-Nadal

Abstract: This paper develops a viable notion of learning for sampling-based algorithms that applies in broader settings than previously considered. More specifically, we model discounted infinite-horizon MDPs with Borel state and action spaces whose rewards and transitions depend on an unknown parameter. To analyze adaptive learning algorithms based on sampling, we introduce a general canonical probability space for this setting. Since standard definitions of regret are inadequate for policy evaluation in this setting, we propose new metrics that arise from decomposing the standard expected regret in discounted infinite-horizon MDPs into three terms: (i) the expected finite-time regret, (ii) the expected state regret, and (iii) the expected residual regret. Component (i) corresponds to the traditional notion of expected regret over a finite horizon. Term (ii) reflects how much future performance is compromised at a given time because earlier decisions have led the system to a less favorable state than under an optimal policy. Finally, metric (iii) measures regret ...
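One natural way such a three-way split could look, sketched here under assumed notation (the paper's precise definitions are not reproduced in this abstract): cut the infinite-horizon discounted regret at a horizon T, where s_t* and s_t denote states on the optimal and learned trajectories, V* is the optimal value function, and V^π is the value of the learned policy's continuation. The three bracketed terms then telescope back to the total regret V*(s_0) - V^π(s_0):

$$
\underbrace{\mathbb{E}\!\left[\sum_{t=0}^{T-1} \gamma^t \bigl(r(s_t^*, a_t^*) - r(s_t, a_t)\bigr)\right]}_{\text{(i) expected finite-time regret}}
\;+\;
\underbrace{\gamma^T\, \mathbb{E}\!\left[V^*(s_T^*) - V^*(s_T)\right]}_{\text{(ii) expected state regret}}
\;+\;
\underbrace{\gamma^T\, \mathbb{E}\!\left[V^*(s_T) - V^{\pi}(s_T)\right]}_{\text{(iii) expected residual regret}}
$$

Term (ii) is zero when the learned trajectory reaches a state as favorable as the optimal one, and term (iii) vanishes as the learned continuation policy approaches optimality from the state actually reached.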
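To make the sampling-based algorithm concrete, here is a minimal, hedged sketch of Thompson (posterior) sampling for a small discounted MDP with unknown Bernoulli reward parameters and a known transition kernel. Everything here, including the tabular setup, the Beta posterior, and the per-step value-iteration solve, is an illustrative assumption, not the paper's construction (which works with Borel state and action spaces):

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.9                       # discount factor
n_states, n_actions = 3, 2

# True (unknown to the learner) reward success probabilities per (state, action).
true_p = rng.uniform(0.2, 0.8, size=(n_states, n_actions))
# Known transition kernel P[s, a, s'] (rows are probability vectors).
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def greedy_policy(p):
    """Value iteration for the MDP induced by sampled reward parameters p."""
    V = np.zeros(n_states)
    for _ in range(500):
        Q = p + gamma * P @ V     # Q[s, a]; P @ V contracts the s' axis
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

# Beta(alpha, beta) posterior counts for each (state, action) reward.
alpha = np.ones((n_states, n_actions))
beta = np.ones((n_states, n_actions))

s = 0
for t in range(2000):
    # Thompson step: draw one parameter from the posterior, act greedily
    # in the MDP that draw induces, then update the posterior with the
    # observed reward.
    p_sample = rng.beta(alpha, beta)
    policy = greedy_policy(p_sample)
    a = policy[s]
    r = int(rng.random() < true_p[s, a])   # Bernoulli reward
    alpha[s, a] += r
    beta[s, a] += 1 - r
    s = rng.choice(n_states, p=P[s, a])

posterior_mean = alpha / (alpha + beta)
```

Re-solving the sampled MDP at every step is the simplest variant to state; episodic resampling schedules are common in practice, since a single posterior draw can be reused for many steps.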