[2603.27874] Stability and Sensitivity Analysis of Relative Temporal-Difference Learning: Extended Version

arXiv - Machine Learning

About this article

Computer Science > Machine Learning
arXiv:2603.27874 (cs)
Submitted on 29 Mar 2026

Title: Stability and Sensitivity Analysis of Relative Temporal-Difference Learning: Extended Version
Authors: Masoud S. Sakha, Rushikesh Kamalapurkar, Sean Meyn

Abstract: Relative temporal-difference (TD) learning was introduced to mitigate the slow convergence of TD methods when the discount factor approaches one, by subtracting a baseline from the temporal-difference update. While this idea has been studied in the tabular setting, stability guarantees with function approximation remain poorly understood. This paper analyzes relative TD learning with linear function approximation. We establish stability conditions for the algorithm and show that the choice of baseline distribution plays a central role. In particular, when the baseline is chosen as the empirical distribution of the state-action process, the algorithm is stable for any non-negative baseline weight and any discount factor. We also provide a sensitivity analysis of the resulting parameter estimates, characterizing both the asymptotic bias and the asymptotic covariance; both are shown to remain uniformly bounded as the discount factor approaches one.

Subjects: Machine Learning (cs.LG); Optimization and Contro...
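The baseline-subtraction idea described in the abstract can be sketched in a few lines of code. The following is a minimal illustration, not the paper's algorithm: the precise update rule, the baseline weight (called `kappa` here), the use of a running empirical state distribution `mu` as the baseline, and the toy three-state Markov reward process are all assumptions made for concreteness.

```python
import numpy as np

# Sketch of relative TD(0) with linear function approximation.
# Ordinary TD(0): theta <- theta + alpha * delta * phi(s), where
#   delta = r + gamma * V(s') - V(s).
# Relative TD subtracts a baseline from delta; here we use the value
# averaged under a running empirical state distribution mu, scaled by a
# non-negative weight kappa (illustrative choices, not from the paper).

rng = np.random.default_rng(0)

# A small 3-state Markov reward process (illustrative data).
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
r = np.array([1.0, 0.0, -1.0])
phi = np.eye(3)        # tabular (one-hot) features, for simplicity
gamma = 0.99           # discount factor close to one
kappa = 1.0            # baseline weight (any non-negative value)
alpha = 0.05           # step size

theta = np.zeros(3)
mu = np.ones(3) / 3.0  # running empirical state distribution
s = 0
for n in range(20_000):
    s_next = rng.choice(3, p=P[s])
    v_s = phi[s] @ theta
    v_next = phi[s_next] @ theta
    # Relative TD error: usual TD error minus the baseline term.
    baseline = kappa * (mu @ phi @ theta)
    delta = r[s] + gamma * v_next - v_s - baseline
    theta += alpha * delta * phi[s]
    # Update the empirical distribution estimate toward the visited state.
    mu += (1.0 / (n + 2)) * (np.eye(3)[s] - mu)
    s = s_next
```

The point of the sketch is the structure of the update: without the `baseline` term it is plain TD(0), whose effective conditioning degrades as `gamma` approaches one; the baseline term recenters the value estimate, which is the mechanism the paper's stability and sensitivity analysis studies.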

Originally published on March 31, 2026. Curated by AI News.

