[2512.00272] WARP: Weight Teleportation for Attack-Resilient Unlearning Protocols
Computer Science > Machine Learning — arXiv:2512.00272 (cs)

[Submitted on 29 Nov 2025 (v1), last revised 2 Mar 2026 (this version, v2)]

Title: WARP: Weight Teleportation for Attack-Resilient Unlearning Protocols
Authors: Mohammad M Maheri, Xavier Cadet, Peter Chin, Hamed Haddadi

Abstract: Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical alternative to full retraining. However, it introduces privacy risks: an adversary with access to pre- and post-unlearning models can exploit their differences for membership inference or data reconstruction. We show these vulnerabilities arise from two factors: large gradient norms of forget-set samples and the close proximity of unlearned parameters to the original model. To demonstrate their severity, we propose unlearning-specific membership inference and reconstruction attacks, showing that several state-of-the-art methods (e.g., NGP, SCRUB) remain vulnerable. To mitigate this leakage, we introduce WARP, a plug-and-play teleportation defense that leverages neural network symmetries to reduce forget-set gradient energy and increase parameter dispersion while preserving predictions. This reparameterization obfuscates the signal of forgotten data, making it harder for attackers to distin...
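The abstract's key mechanism — a reparameterization that moves the weights while preserving every prediction — rests on well-known neural-network symmetries. Since WARP's actual transformation is not given in this abstract, the sketch below only illustrates the underlying idea using the standard positive-scaling symmetry of ReLU networks: scaling a hidden neuron's incoming weights and bias by α > 0 and dividing its outgoing weights by α leaves the function unchanged, yet can disperse the parameters arbitrarily far from their original values. All names (`teleport`, `forward`) and the choice of a two-layer MLP are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small ReLU MLP: 8 inputs -> 16 hidden units -> 4 outputs.
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)

def forward(x, W1, b1, W2, b2):
    h = np.maximum(0.0, W1 @ x + b1)  # ReLU hidden layer
    return W2 @ h + b2

def teleport(W1, b1, W2, alpha):
    """Per-neuron positive scaling symmetry (illustrative, not WARP itself).

    ReLU(alpha * z) = alpha * ReLU(z) for alpha > 0, so scaling the
    incoming weights/bias of each hidden unit by alpha_i and dividing its
    outgoing weights by alpha_i preserves the network function exactly.
    """
    return W1 * alpha[:, None], b1 * alpha, W2 / alpha[None, :]

x = rng.normal(size=8)
alpha = rng.uniform(0.1, 10.0, size=16)  # arbitrary positive scales
W1t, b1t, W2t = teleport(W1, b1, W2, alpha)

same_output = np.allclose(forward(x, W1, b1, W2, b2),
                          forward(x, W1t, b1t, W2t, b2))
param_shift = np.linalg.norm(W1t - W1)

print(same_output)       # predictions are preserved
print(param_shift > 1.0) # parameters have moved substantially
```

The same symmetry class (together with permutation symmetries) is what makes "weight teleportation" possible in principle: the attacker-visible weight difference between pre- and post-unlearning models can be made large without changing the model's input–output behavior.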