[2603.23269] Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs
Computer Science > Cryptography and Security
arXiv:2603.23269 (cs)
[Submitted on 24 Mar 2026]

Title: Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs
Authors: Wenyu Chen, Xiangtao Meng, Chuanchao Zang, Li Wang, Xinyu Gao, Jianing Wang, Peng Zhan, Zheng Li, Shanqing Guo

Abstract: Large Language Models (LLMs) are widely deployed, yet are vulnerable to jailbreak prompts that elicit policy-violating outputs. Although prior studies have uncovered these risks, they typically treat all tokens as equally important during prompt mutation, overlooking the varying contributions of individual tokens to triggering model refusals. Consequently, these attacks introduce substantial redundant searching under query-constrained scenarios, reducing attack efficiency and hindering comprehensive vulnerability assessment. In this work, we conduct a token-level analysis of refusal behavior and observe that token contributions are highly skewed rather than uniform. Moreover, we find strong cross-model consistency in refusal tendencies, enabling the use of a surrogate model to estimate token-level contributions to the target model's refusals. Motivated by these findings, we propose TriageFuzz, a token-aware jailbreak fuzzing framework that adapts the fuzz testing approach with a series of customized design...