[2508.07321] ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering
Computer Science > Computation and Language

arXiv:2508.07321 (cs) [Submitted on 10 Aug 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering

Authors: Shubhra Ghosh, Abhilekh Borah, Aditya Kumar Guru, Kripabandhu Ghosh

Abstract: The rapid proliferation of Large Language Models (LLMs) has significantly contributed to the development of equitable AI systems capable of factual question answering (QA). However, no known study tests LLM robustness when models are presented with obfuscated versions of questions. To systematically evaluate these limitations, we propose a novel technique, ObfusQAte, and, leveraging it, introduce ObfusQA, a comprehensive, first-of-its-kind framework with multi-tiered obfuscation levels designed to examine LLM capabilities across three distinct dimensions: (i) Named-Entity Indirection, (ii) Distractor Indirection, and (iii) Contextual Overload. By capturing these fine-grained distinctions in language, ObfusQA provides a comprehensive benchmark for evaluating LLM robustness and adaptability. Our study observes that LLMs tend to fail or generate hallucinated responses when confronted with these increasingly nuanced variations. To foster rese...
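The abstract names three obfuscation dimensions but the page does not show how they are implemented. The following is a minimal illustrative sketch: the tier names mirror the abstract, while the transformation functions, their signatures, and the example strings are hypothetical stand-ins, not the paper's actual ObfusQAte pipeline.

```python
# Hypothetical illustration of the three obfuscation tiers named in the
# abstract. None of these functions come from the paper; they only sketch
# what each kind of obfuscation could look like on a toy factual question.

def named_entity_indirection(question: str, entity: str, paraphrase: str) -> str:
    """Replace a named entity with an indirect description of it."""
    return question.replace(entity, paraphrase)

def distractor_indirection(question: str, distractors: list[str]) -> str:
    """Prepend irrelevant but plausible statements to the question."""
    return " ".join(distractors) + " " + question

def contextual_overload(question: str, filler: str, repeats: int = 3) -> str:
    """Bury the question inside a long stretch of loosely related context."""
    return (filler + " ") * repeats + question

base = "Who wrote Hamlet?"
tier1 = named_entity_indirection(
    base, "Hamlet", "the tragedy of the Prince of Denmark")
tier2 = distractor_indirection(
    tier1, ["Many plays premiered at the Globe Theatre."])
tier3 = contextual_overload(
    tier2, "Elizabethan drama flourished in late 16th-century London.")
print(tier3)
```

Each tier composes with the previous one, so the final string still has exactly one correct answer but progressively more indirection and noise, matching the multi-tiered design the abstract describes.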