[2601.19245] Beyond In-Domain Detection: SpikeScore for Cross-Domain Hallucination Detection

arXiv - Machine Learning, February 17, 2026

Summary

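This paper introduces SpikeScore, a score for detecting hallucinations in large language model (LLM) responses that is designed to generalize across domains. The approach simulates multi-turn dialogues following the model's initial response and quantifies abrupt uncertainty fluctuations across the turns.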

Why It Matters

As LLMs are increasingly deployed in real-world applications, reliable hallucination detection is critical. Existing detectors perform well when training and test data come from the same domain but generalize poorly across domains. SpikeScore targets this gap, which could improve user trust in deployed LLM systems.

Key Takeaways

SpikeScore quantifies abrupt uncertainty fluctuations in simulated multi-turn dialogues to detect hallucinations.
The method shows improved cross-domain generalization compared to existing techniques.
The study frames the problem as generalizable hallucination detection (GHD): training a detector on data from a single domain while keeping performance robust across diverse related domains.
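
To make the first takeaway concrete, here is a minimal Python sketch of what a spike-style score over per-turn uncertainty could look like. The paper's actual SpikeScore formula is not given in this summary, so the function below is a hypothetical stand-in: it takes the largest absolute second-order difference of an uncertainty sequence as a simple proxy for "abrupt fluctuation". The function name, the choice of statistic, and the example values are all assumptions made for illustration.

```python
from typing import Sequence


def spike_score(uncertainties: Sequence[float]) -> float:
    """Hypothetical proxy for a spike-style score (NOT the paper's definition).

    Returns the largest absolute second-order difference of a per-turn
    uncertainty sequence, so a sudden jump followed by a drop scores high
    and a flat or smoothly drifting sequence scores low.
    """
    if len(uncertainties) < 3:
        raise ValueError("need at least three turns to measure a spike")
    return max(
        abs(uncertainties[i + 1] - 2 * uncertainties[i] + uncertainties[i - 1])
        for i in range(1, len(uncertainties) - 1)
    )


# Made-up per-turn uncertainty values, for illustration only.
steady_dialogue = [0.5, 0.5, 0.5, 0.5, 0.5]
spiky_dialogue = [0.5, 0.5, 2.0, 0.5, 0.5]

print(spike_score(steady_dialogue))
print(spike_score(spiky_dialogue))
```

In this toy setup the sequence with a sudden jump scores higher than the flat one. How per-turn uncertainty is measured, and how the follow-up turns are simulated, are left to the paper.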

Computer Science > Artificial Intelligence
arXiv:2601.19245 (cs)
[Submitted on 27 Jan 2026 (v1), last revised 15 Feb 2026 (this version, v4)]

Title: Beyond In-Domain Detection: SpikeScore for Cross-Domain Hallucination Detection
Authors: Yongxin Deng, Zhen Fang, Sharon Li, Ling Chen

Abstract: Hallucination detection is critical for deploying large language models (LLMs) in real-world applications. Existing hallucination detection methods achieve strong performance when the training and test data come from the same domain, but they suffer from poor cross-domain generalization. In this paper, we study an important yet overlooked problem, termed generalizable hallucination detection (GHD), which aims to train hallucination detectors on data from a single domain while ensuring robust performance across diverse related domains. In studying GHD, we simulate multi-turn dialogues following LLMs' initial response and observe an interesting phenomenon: hallucination-initiated multi-turn dialogues universally exhibit larger uncertainty fluctuations than factual ones across different domains. Based on the phenomenon, we propose a new score SpikeScore, which quantifies abrupt fluctuations in multi-turn dialogues. Through both theoretical analysis and empirical validation, we demonstrate that SpikeScore achieves str...
