[2603.02781] Scores Know Bobs Voice: Speaker Impersonation Attack
Computer Science > Cryptography and Security
arXiv:2603.02781 (cs)
[Submitted on 3 Mar 2026]
Title: Scores Know Bobs Voice: Speaker Impersonation Attack
Authors: Chanwoo Hwang, Sunpill Kim, Yong Kiam Tan, Tianchi Liu, Seunghun Paik, Dongsoo Kim, Mondal Soumik, Khin Mi Mi Aung, Jae Hong Seo
Abstract: Advances in deep learning have enabled the widespread deployment of speaker recognition systems (SRSs), yet they remain vulnerable to score-based impersonation attacks. Existing attacks that operate directly on raw waveforms require a large number of queries due to the difficulty of optimizing in high-dimensional audio spaces. Latent-space optimization within generative models offers improved efficiency, but these latent spaces are shaped by data distribution matching and do not inherently capture speaker-discriminative geometry. As a result, optimization trajectories often fail to align with the adversarial direction needed to maximize victim scores. To address this limitation, we propose an inversion-based generative attack framework that explicitly aligns the latent space of the synthesis model with the discriminative feature space of SRSs. We first analyze the requirements of an inverse model for score-based attacks and introduce a feature-aligned inversion strategy that geometrically synchronizes latent representations with spe...
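The score-based setting described in the abstract, where the attacker observes only a scalar similarity score and searches a low-dimensional latent space instead of the raw waveform, can be illustrated with a toy sketch. This is not the paper's method: the feature-aligned inversion is not implemented, the "victim" is a hypothetical cosine scorer against a hidden vector, the generator is assumed to be the identity map, and the search is plain random hill climbing. The names `make_toy_scorer` and `hill_climb_attack` are invented for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def make_toy_scorer(dim, seed=0):
    """Hypothetical stand-in for a victim SRS that returns only a scalar score.

    The hidden target vector plays the role of the enrolled speaker's
    embedding; the attacker never reads it directly.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def score(latent):
        # Assumption: generator and feature extractor are the identity map,
        # so the latent is compared with the hidden target directly.
        return cosine(latent, target)

    return score


def hill_climb_attack(score, dim, queries=500, sigma=0.3, seed=1):
    """Random hill climbing in latent space using score feedback only.

    Returns the best latent found and the best-so-far score after each query.
    """
    rng = random.Random(seed)
    best = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    best_score = score(best)  # initial query, not counted in `queries`
    history = [best_score]
    for _ in range(queries):
        # Propose a Gaussian perturbation of the current best latent.
        candidate = [z + sigma * rng.gauss(0.0, 1.0) for z in best]
        candidate_score = score(candidate)
        # Keep the candidate only if the victim's score strictly improves.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, history


if __name__ == "__main__":
    scorer = make_toy_scorer(dim=16)
    _, scores = hill_climb_attack(scorer, dim=16)
    print(f"initial score {scores[0]:.3f}, final score {scores[-1]:.3f}")
```

In this toy the latent space coincides with the scorer's feature space, so every accepted step moves toward the target. The abstract's point is that a real generative model's latent space does not have this property by default, which is the gap the proposed alignment is meant to close.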