[2603.00436] ROKA: Robust Knowledge Unlearning against Adversaries
Computer Science > Machine Learning

arXiv:2603.00436 (cs) [Submitted on 28 Feb 2026]

Title: ROKA: Robust Knowledge Unlearning against Adversaries
Authors: Jinmyeong Shin, Joshua Tapia, Nicholas Ferreira, Gabriel Diaz, Moayed Daneshyari, Hyeran Jeon

Abstract: The need for machine unlearning is critical for data privacy, yet existing methods often cause Knowledge Contamination by unintentionally damaging related knowledge. This degraded post-unlearning performance has recently been leveraged for new inference and backdoor attacks. Most prior studies design adversarial unlearning requests that require poisoning or duplicating training data. In this study, we introduce a new unlearning-induced attack model, the indirect unlearning attack, which requires no data manipulation but instead exploits the consequences of knowledge contamination to perturb model accuracy on security-critical predictions. To mitigate this attack, we introduce a theoretical framework that models neural networks as Neural Knowledge Systems. Based on this, we propose ROKA, a robust unlearning strategy centered on Neural Healing. Unlike conventional unlearning methods that only destroy information, ROKA constructively rebalances the model by nullifying the influence of forgotten data while strengthening its conceptual neighbors. To the best of our knowl...