[2602.04755] When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?

About this article

Abstract page for arXiv paper 2602.04755: When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?

Computer Science > Computation and Language

arXiv:2602.04755 (cs)

[Submitted on 4 Feb 2026 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?

Authors: Xinyu Zhou, Chang Jin, Carsten Eickhoff, Zhijiang Guo, Seyed Ali Bahrainian

Abstract: Large language models (LLMs) rarely admit uncertainty, often producing fluent but misleading answers, rather than abstaining (i.e., refusing to answer). This weakness is even evident in temporal question answering, where models frequently ignore time-sensitive evidence and conflate facts across different time-periods. In this paper, we present the first empirical study of training LLMs with an abstention ability while reasoning about temporal QA. Existing approaches such as calibration might be unreliable in capturing uncertainty in complex reasoning. We instead frame abstention as a teachable skill and introduce a pipeline that couples Chain-of-Thought (CoT) supervision with Reinforcement Learning (RL) guided by abstention-aware rewards. Our goal is to systematically analyze how different information types and training techniques affect temporal reasoning with abstention behavior in LLMs. Through extensive experiments studying various methods, we find that RL yields strong empirical gains on ...
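To make the phrase "abstention-aware rewards" concrete, here is a minimal sketch of what such a reward could look like. It is not the paper's reward: the abstract does not specify one. The `ABSTAIN` marker, the function name `abstention_reward`, the exact-match comparison, and all four reward values are assumptions chosen only for illustration.

```python
from typing import Optional

# Hypothetical marker the model emits when it declines to answer (an assumption).
ABSTAIN = "unanswerable"


def abstention_reward(
    prediction: str,
    gold: Optional[str],
    r_correct: float = 1.0,
    r_wrong: float = -1.0,
    r_over_abstain: float = -0.5,
) -> float:
    """Score one model output.

    gold is None when the question has no supported answer, so the
    desired behavior is to abstain. All reward values are illustrative.
    """
    pred = prediction.strip().lower()
    abstained = pred == ABSTAIN

    if gold is None:
        # Unanswerable question: abstaining is the right call.
        return r_correct if abstained else r_wrong

    if abstained:
        # Answerable question, but the model refused: mild penalty.
        return r_over_abstain

    # Answerable question with an attempted answer: exact match only.
    return r_correct if pred == gold.strip().lower() else r_wrong


if __name__ == "__main__":
    print(abstention_reward("1998", "1998"))
    print(abstention_reward("Unanswerable", None))
    print(abstention_reward("unanswerable", "1998"))
    print(abstention_reward("2001", None))
```

In this sketch, penalizing a confident wrong answer more heavily than an unnecessary refusal is one possible design choice: it makes abstaining the better option when the model is unsure, while still discouraging blanket refusal.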

Originally published on March 05, 2026. Curated by AI News.
