[2510.04398] SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
Summary
The paper presents SECA, a method for eliciting hallucinations in large language models (LLMs) through prompt modifications that preserve the original meaning and remain coherent, addressing the unrealistic prompts produced by previous adversarial techniques.
Why It Matters
As LLMs are increasingly used in critical applications, understanding their vulnerabilities is essential. SECA provides a novel approach to identify realistic adversarial prompts, which can help improve the reliability and safety of LLMs in real-world scenarios.
Key Takeaways
- SECA formulates the problem of eliciting hallucinations as a constrained optimization task.
- The method achieves higher success rates in eliciting hallucinations than prior attacks while maintaining semantic equivalence and coherence.
- SECA highlights the vulnerability of both open-source and commercial LLMs to realistic prompt variations.
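The constrained-optimization framing in the takeaways above can be sketched as a search over candidate prompt rewrites: maximize a hallucination objective subject to semantic-equivalence and coherence constraints. The sketch below is purely illustrative; every function name, constraint check, and the scoring heuristic is a hypothetical stand-in (the paper's actual checkers and objective are not reproduced here).

```python
# Illustrative sketch of a SECA-style constrained search. All names,
# constraint checks, and the scoring heuristic are hypothetical
# stand-ins, not the paper's actual implementation.

def is_semantically_equivalent(original: str, candidate: str) -> bool:
    # Stand-in for the semantic-equivalence constraint: here we only
    # require the candidate to mention the same key entities.
    text = candidate.lower()
    return "capital" in text and "france" in text

def is_coherent(candidate: str) -> bool:
    # Stand-in for the coherence constraint: require a multi-word sentence.
    return len(candidate.split()) >= 3

def hallucination_score(candidate: str) -> float:
    # Stand-in objective: in SECA this would query the target LLM and
    # measure whether it hallucinates; here, a toy placeholder.
    return float(len(candidate))

def seca_search(original: str, candidates: list[str]) -> list[str]:
    # Constrained optimization over prompt rewrites: rank by the
    # objective, subject to the equivalence and coherence constraints.
    feasible = [c for c in candidates
                if is_semantically_equivalent(original, c) and is_coherent(c)]
    return sorted(feasible, key=hallucination_score, reverse=True)

original = "What is the capital of France?"
candidates = [
    "Which city serves as the capital of France?",
    "capital france",                                   # incoherent fragment
    "Tell me about cheese.",                            # changes the intent
    "Could you tell me what the capital city of France is?",
]
ranked = seca_search(original, candidates)
print(ranked)
```

The key structural point this mirrors is that, unlike token-insertion attacks, infeasible rewrites (incoherent fragments or intent changes) are filtered out before the objective is ever consulted, so every surviving attack prompt stays realistic.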
Paper Details
Computer Science > Computation and Language, arXiv:2510.04398 (cs)
Submitted on 5 Oct 2025 (v1); last revised 15 Feb 2026 (this version, v3)
Authors: Buyun Liang, Liangzu Peng, Jinqi Luo, Darshan Thaker, Kwan Ho Ryan Chan, René Vidal
Abstract
Large Language Models (LLMs) are increasingly deployed in high-risk domains. However, state-of-the-art LLMs often exhibit hallucinations, raising serious concerns about their reliability. Prior work has explored adversarial attacks to elicit hallucinations in LLMs, but these methods often rely on unrealistic prompts, either by inserting nonsensical tokens or by altering the original semantic intent. Consequently, such approaches provide limited insight into how hallucinations arise in real-world settings. In contrast, adversarial attacks in computer vision typically involve realistic modifications to input images. However, the problem of identifying realistic adversarial prompts for eliciting LLM hallucinations remains largely underexplored. To address this gap, we propose Semantically Equivalent and Coherent Attacks (SECA), which elicit hallucinations via realistic modifications to the prompt that preserve its meaning while maintaining semantic coherence. Our contributions are threefold: ...