[2604.04255] Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

arXiv - Machine Learning

Computer Science > Machine Learning

arXiv:2604.04255 (cs)

[Submitted on 5 Apr 2026]

Title: Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

Authors: Aobo Chen, Chenxu Zhao, Chenglin Miao, Mengdi Huai

Abstract: Large language models (LLMs) possess strong semantic understanding, driving significant progress in data mining applications. This is further enhanced by large reasoning models (LRMs), which provide explicit multi-step reasoning traces. On the other hand, the growing need for the right to be forgotten has driven the development of machine unlearning techniques, which aim to eliminate the influence of specific data from trained models without full retraining. However, unlearning may also introduce new security vulnerabilities by exposing additional interaction surfaces. Although many studies have investigated unlearning attacks, there is no prior work on LRMs. To bridge this gap, in this paper we first propose an LRM unlearning attack that forces incorrect final answers while generating convincing but misleading reasoning traces. This objective is challenging due to non-differentiable logical constraints, the weak optimization effect over long rationales, and discrete forget set selection. To overcome these challenges, we introduce a bi-level exact unlearning attack that in...

Originally published on April 07, 2026. Curated by AI News.
