[2402.11877] Learning the Model While Learning Q: Finite-Time Sample Complexity of Online SyncMBQ
arXiv:2402.11877 (cs)
[Submitted on 19 Feb 2024 (v1), last revised 30 Mar 2026 (this version, v2)]

Title: Learning the Model While Learning Q: Finite-Time Sample Complexity of Online SyncMBQ
Authors: Han-Dong Lim, HyeAnn Lee, Donghwan Lee

Abstract: Reinforcement learning has witnessed significant advancements, particularly with the emergence of model-based approaches. Among these, $Q$-learning has proven to be a powerful algorithm in model-free settings. However, the extension of $Q$-learning to a model-based framework remains relatively unexplored. In this paper, we investigate the sample complexity of $Q$-learning when integrated with a model-based approach. The proposed algorithm learns both the model and the $Q$-value in an online manner. We demonstrate a near-optimal sample complexity result within a broad range of step sizes.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2402.11877 [cs.LG] (or arXiv:2402.11877v2 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2402.11877 (arXiv-issued DOI via DataCite)

Submission history
From: Han-Dong Lim
[v1] Mon, 19 Feb 2024 06:33:51 UTC (787 KB)
[v2] Mon, 30 Mar 2026 14:38:20 UTC (532 KB)
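
The abstract describes learning both the model and the $Q$-value online. The sketch below illustrates that general idea only and is not the paper's SyncMBQ algorithm, whose exact update rule is not given on this page. It assumes a tabular MDP, a generative model that returns one next-state sample for every state-action pair per iteration (the "synchronous" setting), known rewards, and a constant step size. The function name `sync_model_based_q` and all parameter names are hypothetical.

```python
import random


def sync_model_based_q(P, R, gamma, num_iters, step_size, seed=0):
    """Hypothetical sketch of synchronous model-based Q-learning.

    P[s][a] is a list of true next-state probabilities, used only to
    simulate samples. R[s][a] is a reward, assumed known for brevity.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    states = range(n_states)
    # counts[s][a][s2] = number of observed transitions (s, a) -> s2.
    counts = [[[0] * n_states for _ in range(n_actions)] for _ in states]
    Q = [[0.0] * n_actions for _ in states]

    for k in range(1, num_iters + 1):
        # Synchronous sampling: one fresh next state for every (s, a).
        for s in states:
            for a in range(n_actions):
                s_next = rng.choices(states, weights=P[s][a])[0]
                counts[s][a][s_next] += 1

        # Greedy state values under the current Q estimate.
        V = [max(Q[s]) for s in states]

        for s in states:
            for a in range(n_actions):
                # Empirical transition model after k samples per pair.
                p_hat = [c / k for c in counts[s][a]]
                # Bellman optimality target under the estimated model.
                target = R[s][a] + gamma * sum(p * v for p, v in zip(p_hat, V))
                # Step-size-weighted move of Q toward the target.
                Q[s][a] = (1 - step_size) * Q[s][a] + step_size * target
    return Q


# Toy two-state MDP with deterministic transitions (illustrative only).
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
R = [[0.0, 1.0], [1.0, 1.0]]
Q = sync_model_based_q(P, R, gamma=0.5, num_iters=200, step_size=0.5)
```

In this toy example the transitions are deterministic, so the empirical model is exact after the first sample and the iterates approach the optimal $Q$-values of the toy MDP. With stochastic transitions, the model error and the $Q$ error shrink together, which is the regime a finite-time sample complexity analysis addresses.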