[2512.15405] EUBRL: Epistemic Uncertainty Directed Bayesian Reinforcement Learning

arXiv - Machine Learning March 03, 2026 3 min read

About this article

Abstract page for arXiv paper 2512.15405: EUBRL: Epistemic Uncertainty Directed Bayesian Reinforcement Learning

arXiv:2512.15405 (cs)

Submitted on 17 Dec 2025 (v1), last revised 28 Feb 2026 (this version, v2)

Title: EUBRL: Epistemic Uncertainty Directed Bayesian Reinforcement Learning

Authors: Jianfei Ma, Wee Sun Lee

Abstract: At the boundary between the known and the unknown, an agent inevitably confronts the dilemma of whether to explore or to exploit. Epistemic uncertainty reflects such boundaries, representing systematic uncertainty due to limited knowledge. In this paper, we propose a Bayesian reinforcement learning (RL) algorithm, $\texttt{EUBRL}$, which leverages epistemic guidance to achieve principled exploration. This guidance adaptively reduces per-step regret arising from estimation errors. We establish nearly minimax-optimal regret and sample complexity guarantees for a class of sufficiently expressive priors in infinite-horizon discounted MDPs. Empirically, we evaluate $\texttt{EUBRL}$ on tasks characterized by sparse rewards, long horizons, and stochasticity. Results demonstrate that $\texttt{EUBRL}$ achieves superior sample efficiency, scalability, and consistency.

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2512.15405 [cs.LG] (or arXiv:2512.15405v2 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2512.15405
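The abstract describes epistemic uncertainty as uncertainty due to limited knowledge. As a generic illustration only (this is not the EUBRL algorithm, whose details are not given on this page), the sketch below assumes a tabular setting with a Dirichlet posterior over next-state probabilities and uses the total posterior variance as a hypothetical uncertainty score. The function names, the uniform prior of 1.0, and the example counts are all assumptions made for the example.

```python
def dirichlet_posterior(counts, prior=1.0):
    """Posterior Dirichlet parameters from next-state visit counts.

    Assumes a symmetric Dirichlet prior with concentration `prior`.
    """
    return [c + prior for c in counts]


def epistemic_uncertainty(alpha):
    """Sum of posterior variances of the transition probabilities.

    For Dirichlet(alpha) with total a0, Var[p_i] = m_i * (1 - m_i) / (a0 + 1),
    where m_i = alpha_i / a0 is the posterior mean.
    """
    total = sum(alpha)
    means = [a / total for a in alpha]
    return sum(m * (1.0 - m) for m in means) / (total + 1.0)


# Hypothetical counts for two state-action pairs with three possible next states.
rarely_tried = dirichlet_posterior([1, 0, 0])    # one observation
often_tried = dirichlet_posterior([100, 0, 0])   # one hundred observations

u_rare = epistemic_uncertainty(rarely_tried)
u_often = epistemic_uncertainty(often_tried)

# The score shrinks as data accumulates, so an agent that prefers
# high-uncertainty pairs would be steered toward the rarely tried one.
print(u_rare > u_often)
```

Under these assumptions the score depends only on how much data has been seen and how spread out the posterior is, which is the sense in which it reflects "limited knowledge" rather than randomness in the environment itself.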

Originally published on March 03, 2026. Curated by AI News.

