[2604.03571] Selective Forgetting for Large Reasoning Models
Computer Science > Artificial Intelligence
arXiv:2604.03571 (cs) [Submitted on 4 Apr 2026]

Title: Selective Forgetting for Large Reasoning Models
Authors: Tuan Le, Wei Qian, Mengdi Huai

Abstract: Large Reasoning Models (LRMs) generate structured chains of thought (CoTs) before producing final answers, making them especially vulnerable to knowledge leakage through intermediate reasoning steps. Meanwhile, the memorization of sensitive information in training data, such as copyrighted and private content, has raised ethical and legal concerns. To address these issues, selective forgetting (also known as machine unlearning) has emerged as a potential remedy for LRMs. However, existing unlearning methods primarily target final answers and may degrade the overall reasoning ability of LRMs after forgetting; moreover, applying unlearning directly to entire CoTs can likewise harm general reasoning. The key challenge for LRM unlearning is therefore to remove the targeted knowledge precisely while preserving the integrity of general reasoning capabilities. To bridge this gap, in this paper we propose a novel LRM unlearning framework that selectively removes sensitive reasoning components while preserving general reasoning capabilities. Our approach leverages multiple LLMs with retrieval-augmented generation (RAG) to analyze CoT t...
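The abstract only sketches the framework, so the concrete mechanism is not specified here. As a purely illustrative assumption (not the paper's method), one way to make forgetting "selective" at the CoT level is to restrict the unlearning objective to flagged reasoning spans via a per-token loss mask, leaving all other CoT tokens under the ordinary training objective. The helper names and span format below are hypothetical:

```python
# Hypothetical sketch: restrict an unlearning objective to flagged spans of a
# chain of thought (CoT). Span flags would come from some upstream analysis
# (the abstract mentions multiple LLMs with RAG); here they are just given.

def sensitive_token_mask(num_tokens, sensitive_spans):
    """Return a 0/1 mask over CoT token positions.

    sensitive_spans: list of (start, end) half-open index ranges flagged as
    containing knowledge to forget. 1 = unlearn this token, 0 = preserve.
    """
    mask = [0] * num_tokens
    for start, end in sensitive_spans:
        for i in range(max(0, start), min(num_tokens, end)):
            mask[i] = 1
    return mask

def selective_loss(per_token_losses, mask, forget_weight=1.0):
    """Combine per-token losses so that flagged tokens contribute a negated
    (gradient-ascent-style) term while unflagged tokens keep the standard
    objective, approximating "forget here, preserve elsewhere"."""
    total = 0.0
    for loss, flagged in zip(per_token_losses, mask):
        total += -forget_weight * loss if flagged else loss
    return total

# Usage: flag tokens 2-3 of a 6-token CoT for forgetting.
mask = sensitive_token_mask(6, [(2, 4)])   # [0, 0, 1, 1, 0, 0]
combined = selective_loss([1.0] * 6, mask) # 4 preserved - 2 forgotten = 2.0
```

In a real training loop the combined scalar would be backpropagated through the model, so the negated terms push probability mass away from the flagged reasoning steps only; whether the paper uses this sign-flip formulation or a different objective is not stated in the abstract.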