[2510.04398] SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
Summary
The paper presents SECA, a method for eliciting hallucinations in large language models (LLMs) through prompt modifications that preserve the original meaning and remain coherent, addressing the unrealistic prompts produced by previous adversarial techniques.
Why It Matters
As LLMs are increasingly used in critical applications, understanding their vulnerabilities is essential. SECA provides a novel approach to identify realistic adversarial prompts, which can help improve the reliability and safety of LLMs in real-world scenarios.
Key Takeaways
- SECA formulates the problem of eliciting hallucinations as a constrained optimization task.
- The method achieves higher success rates in eliciting hallucinations than prior attacks while maintaining semantic equivalence and coherence.
- SECA highlights the vulnerability of both open-source and commercial LLMs to realistic prompt variations.
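The constrained-optimization framing in the takeaways above can be sketched as a search over candidate prompt rewrites: maximize a hallucination objective subject to semantic-equivalence and coherence constraints. The sketch below is purely illustrative; every function name, constraint check, and the scoring heuristic is a hypothetical stand-in (the paper's actual checkers and objective are not reproduced here).

```python
# Illustrative sketch of a SECA-style constrained search. All names,
# constraint checks, and the scoring heuristic are hypothetical
# stand-ins, not the paper's actual implementation.

def is_semantically_equivalent(original: str, candidate: str) -> bool:
    # Stand-in for the semantic-equivalence constraint: here we only
    # require the candidate to mention the same key entities.
    text = candidate.lower()
    return "capital" in text and "france" in text

def is_coherent(candidate: str) -> bool:
    # Stand-in for the coherence constraint: require a multi-word sentence.
    return len(candidate.split()) >= 3

def hallucination_score(candidate: str) -> float:
    # Stand-in objective: in SECA this would query the target LLM and
    # measure whether it hallucinates; here, a toy placeholder.
    return float(len(candidate))

def seca_search(original: str, candidates: list[str]) -> list[str]:
    # Constrained optimization over prompt rewrites: rank by the
    # objective, subject to the equivalence and coherence constraints.
    feasible = [c for c in candidates
                if is_semantically_equivalent(original, c) and is_coherent(c)]
    return sorted(feasible, key=hallucination_score, reverse=True)

original = "What is the capital of France?"
candidates = [
    "Which city serves as the capital of France?",
    "capital france",                                   # incoherent fragment
    "Tell me about cheese.",                            # changes the intent
    "Could you tell me what the capital city of France is?",
]
ranked = seca_search(original, candidates)
print(ranked)
```

The key structural point this mirrors is that, unlike token-insertion attacks, infeasible rewrites (incoherent fragments or intent changes) are filtered out before the objective is ever consulted, so every surviving attack prompt stays realistic.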
Paper Details
Computer Science > Computation and Language, arXiv:2510.04398 (cs)
Submitted on 5 Oct 2025 (v1); last revised 15 Feb 2026 (this version, v3)
Authors: Buyun Liang, Liangzu Peng, Jinqi Luo, Darshan Thaker, Kwan Ho Ryan Chan, René Vidal
Abstract
Large Language Models (LLMs) are increasingly deployed in high-risk domains. However, state-of-the-art LLMs often exhibit hallucinations, raising serious concerns about their reliability. Prior work has explored adversarial attacks to elicit hallucinations in LLMs, but these methods often rely on unrealistic prompts, either by inserting nonsensical tokens or by altering the original semantic intent. Consequently, such approaches provide limited insight into how hallucinations arise in real-world settings. In contrast, adversarial attacks in computer vision typically involve realistic modifications to input images. However, the problem of identifying realistic adversarial prompts for eliciting LLM hallucinations remains largely underexplored. To address this gap, we propose Semantically Equivalent and Coherent Attacks (SECA), which elicit hallucinations via realistic modifications to the prompt that preserve its meaning while maintaining semantic coherence. Our contributions are threefold: ...