[2603.01019] BadRSSD: Backdoor Attacks on Regularized Self-Supervised Diffusion Models
Computer Science > Cryptography and Security
arXiv:2603.01019 (cs)
[Submitted on 1 Mar 2026]

Title: BadRSSD: Backdoor Attacks on Regularized Self-Supervised Diffusion Models
Authors: Jiayao Wang, Yiping Zhang, Mohammad Maruf Hasan, Xiaoying Lei, Jiale Zhang, Junwu Zhu, Qilin Wu, Dongfang Zhao

Abstract: Self-supervised diffusion models learn high-quality visual representations via latent-space denoising. However, their representation layer poses a distinct threat: unlike traditional attacks that target generative outputs, its unconstrained latent semantic space allows for stealthy backdoors, permitting malicious control upon triggering. In this paper, we propose BadRSSD, the first backdoor attack targeting the representation layer of self-supervised diffusion models. Specifically, it hijacks the semantic representations of trigger-bearing poisoned samples in Principal Component Analysis (PCA) space, pulling them toward the representation of a target image, then controls the denoising trajectory during diffusion by applying coordinated constraints across the latent, pixel, and feature-distribution spaces to steer the model toward generating the specified target. Additionally, we integrate representation dispersion regularization into the constraint framework to maintain feature-space uniformity, significantly enhancing attack stealth. This appro...
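The abstract names two ingredients without giving formulas: a PCA-space hijacking objective that pulls a poisoned sample's representation toward a target image's, and a dispersion (uniformity) regularizer over the feature space. The paper's actual losses are not reproduced on this page; the snippet below is a minimal, hypothetical NumPy sketch of what such terms could look like, with all function names (`pca_basis`, `hijack_loss`, `dispersion_reg`) and the uniformity-style form of the regularizer being our own assumptions, not the authors' definitions.

```python
import numpy as np

def pca_basis(feats, k):
    """Top-k principal directions of a (n, d) feature matrix via SVD."""
    centered = feats - feats.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]  # shape (k, d)

def hijack_loss(poisoned_feat, target_feat, basis):
    """Squared distance between PCA projections of poisoned and target
    features -- a stand-in for pulling the poisoned representation toward
    the target image in PCA space."""
    diff = basis @ (poisoned_feat - target_feat)
    return float(diff @ diff)

def dispersion_reg(feats, t=2.0):
    """Uniformity-style dispersion term (hypothetical choice): log mean of
    exp(-t * pairwise squared distance). Lower (more negative) values mean
    features are more spread out; identical features give 0."""
    n = feats.shape[0]
    sq = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    mask = ~np.eye(n, dtype=bool)  # exclude self-distances
    return float(np.log(np.exp(-t * sq[mask]).mean()))
```

Under this sketch, training a backdoored encoder would minimize `hijack_loss` on triggered samples while adding `dispersion_reg` over clean-batch features so the poisoning does not collapse the feature distribution, which is what the abstract credits for the attack's stealth.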