[2603.03191] A Covering Framework for Offline POMDPs Learning using Belief Space Metric

arXiv - Machine Learning · March 04, 2026 · 3 min read

About this article

Statistics > Machine Learning
arXiv:2603.03191 (stat)
[Submitted on 3 Mar 2026]

Title: A Covering Framework for Offline POMDPs Learning using Belief Space Metric
Authors: Youheng Zhu, Yiping Lu

Abstract: In off-policy evaluation (OPE) for partially observable Markov decision processes (POMDPs), an agent must infer hidden states from past observations, which exacerbates both the curse of horizon and the curse of memory in existing OPE methods. This paper introduces a novel covering analysis framework that exploits the intrinsic metric structure of the belief space (distributions over latent states) to relax traditional coverage assumptions. By assuming value-relevant functions are Lipschitz continuous in the belief space, we derive error bounds that mitigate exponential blow-ups in horizon and memory length. Our unified analysis technique applies to a broad class of OPE algorithms, yielding concrete error bounds and coverage requirements expressed in terms of belief space metrics rather than raw history coverage. We illustrate the improved sample efficiency of this framework via case studies: the double sampling Bellman error minimization algorithm, and the memory-based future-dependent value functions (FDVF). In both cases, our coverage definition based on the belief space metric yields tighter bounds.

Subje...
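To make the contrast between raw history coverage and belief space coverage concrete, here is a minimal Python sketch. It is not the paper's algorithm or analysis. It assumes a hypothetical 2-state, 2-action, 2-observation POMDP with made-up matrices `T` and `O`, uses the standard Bayes filter to map each action-observation history to a belief, and uses L1 distance as a stand-in for the belief space metric (the abstract does not say which metric the authors use). The helper names `belief_update`, `enumerate_beliefs`, and `greedy_cover` are illustrative only.

```python
import itertools

import numpy as np


def belief_update(belief, T, O, action, obs):
    """One Bayes-filter step.

    b'(s') is proportional to O[a, s', o] * sum_s b(s) * T[a, s, s'].
    Returns None if the observation has zero probability.
    """
    predicted = belief @ T[action]                # shape (n_states,)
    unnormalized = O[action][:, obs] * predicted  # weight by observation likelihood
    total = unnormalized.sum()
    if total == 0.0:
        return None
    return unnormalized / total


def enumerate_beliefs(b0, T, O, horizon):
    """Beliefs reached by every feasible length-`horizon` history."""
    n_actions, n_obs = T.shape[0], O.shape[2]
    steps = list(itertools.product(range(n_actions), range(n_obs)))
    beliefs = []
    for history in itertools.product(steps, repeat=horizon):
        b = b0
        for action, obs in history:
            b = belief_update(b, T, O, action, obs)
            if b is None:
                break
        if b is not None:
            beliefs.append(b)
    return beliefs


def greedy_cover(points, eps):
    """Greedy eps-net under L1 distance (an assumed metric).

    Every input point ends up within eps of at least one returned center.
    """
    centers = []
    for p in points:
        if all(np.abs(p - c).sum() > eps for c in centers):
            centers.append(p)
    return centers


# Hypothetical toy model: T[a, s, s'] transitions, O[a, s', o] observations.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.8, 0.2], [0.3, 0.7]]])
b0 = np.array([0.5, 0.5])

horizon, eps = 4, 0.1
beliefs = enumerate_beliefs(b0, T, O, horizon)
centers = greedy_cover(beliefs, eps)

print("distinct histories:", len(beliefs))
print("belief-space centers at eps =", eps, ":", len(centers))
```

The number of histories grows as (actions × observations)^horizon, while the cover lives in the fixed-dimensional belief simplex. That gap is the intuition behind stating coverage requirements in belief space, under the Lipschitz assumption described in the abstract.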

Originally published on March 04, 2026. Curated by AI News.

Related Articles

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

[2601.22440] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations

[2601.13222] Incorporating Q&A Nuggets into Retrieval-Augmented Generation

[2512.01707] StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos
