[2505.20934] NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion
Computer Science > Machine Learning

arXiv:2505.20934 (cs)

[Submitted on 27 May 2025 (v1), last revised 3 Mar 2026 (this version, v2)]

Title: NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion

Authors: Max Collins, Jordan Vice, Tim French, Ajmal Mian

Abstract: Adversarial samples exploit irregularities in the manifold "learned" by deep learning models to cause misclassifications. The study of these adversarial samples provides insight into the features a model uses to classify inputs, which can be leveraged to improve robustness against future attacks. However, much of the existing literature focuses on constrained adversarial samples, which do not accurately reflect test-time errors encountered in real-world settings. To address this, we propose "NatADiff", an adversarial sampling scheme that leverages denoising diffusion to generate natural adversarial samples. Our approach is based on the observation that natural adversarial samples frequently contain structural elements from the adversarial class. Deep learning models can exploit these structural elements to shortcut the classification process, rather than learning to genuinely distinguish between classes. To leverage this behavior, we guide the diffusion trajectory towards the intersection of the true and adversarial classes, combining time-travel samp...
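The abstract's core idea, steering a denoising trajectory toward the intersection of the true and adversarial classes, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a toy linear-softmax classifier and an identity "denoiser" stand-in, and simply adds the gradient of the combined class log-likelihoods to each denoised estimate, in the spirit of classifier guidance. The function names (`boundary_guidance`, `guided_denoise_step`) and the guidance `scale` parameter are illustrative, not from the paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def boundary_guidance(x, W, b, y_true, y_adv, scale=1.0):
    """Gradient of log p(y_true|x) + log p(y_adv|x) for a linear softmax
    classifier with logits z = W @ x + b. Ascending this gradient pushes x
    toward the region where both classes are likely, i.e. their boundary."""
    p = softmax(W @ x + b)
    # d/dx log p(y|x) = W[y] - W.T @ p for a linear classifier
    grad = (W[y_true] - W.T @ p) + (W[y_adv] - W.T @ p)
    return scale * grad

def guided_denoise_step(x_t, denoise_fn, W, b, y_true, y_adv, scale=0.1):
    """One denoising step nudged toward the true/adversarial intersection.
    `denoise_fn` stands in for a diffusion model's denoised estimate."""
    x_pred = denoise_fn(x_t)
    return x_pred + boundary_guidance(x_pred, W, b, y_true, y_adv, scale)
```

With an identity denoiser this reduces to gradient ascent on the joint class log-probability, so a small guided step should raise the combined likelihood of the true and adversarial classes; a real diffusion sampler would interleave this nudge with its scheduler updates.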