[2602.13151] Quantization-Robust LLM Unlearning via Low-Rank Adaptation


Summary

The paper presents a method for unlearning knowledge in large language models (LLMs) while maintaining performance after quantization, using low-rank adaptation (LoRA) to ensure effective updates are preserved.

Why It Matters

As LLMs become integral to various applications, the ability to efficiently remove sensitive information while maintaining model performance is crucial. This research addresses the challenge of quantization, which can hinder unlearning processes, thereby enhancing privacy and utility in real-world deployments.

Key Takeaways

  • Low-rank adaptation (LoRA) allows for effective unlearning in quantized LLMs.
  • Standard full-parameter fine-tuning produces weight changes too small to survive aggressive 4-bit quantization, allowing the unlearned knowledge to resurface.
  • LoRA improves model utility and reduces privacy leakage in quantized environments.
  • The proposed method is beneficial for scenarios requiring both unlearning and quantization.
  • Performance metrics show significant improvements in utility and privacy protection.
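The second takeaway can be illustrated with a toy example (a minimal sketch, not the paper's actual PTQ scheme): under uniform 4-bit quantization there are only 16 representable levels, so a small full-fine-tuning update often rounds back to the same level as the original weight, while a larger, LoRA-concentrated update crosses to a different level and survives. The grid range and update magnitudes below are illustrative values.

```python
import numpy as np

def quantize_4bit(w, w_min=-1.0, w_max=1.0):
    # Uniform 4-bit grid: 16 levels between w_min and w_max,
    # so adjacent levels are (w_max - w_min) / 15 apart (~0.133 here).
    step = (w_max - w_min) / 15
    idx = np.round((w - w_min) / step)
    return idx * step + w_min

w = 0.35       # a base weight
tiny = 0.01    # small update, typical of full-parameter fine-tuning
large = 0.20   # larger update, concentrated as in a low-rank adapter

print(quantize_4bit(w + tiny) == quantize_4bit(w))    # True: update erased
print(quantize_4bit(w + large) == quantize_4bit(w))   # False: update survives
```

Both weights snap to the same quantized value in the first case, so the unlearning update is effectively undone at inference time; only an update larger than roughly half a grid step is guaranteed to move the weight to a different level.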

Computer Science > Machine Learning · arXiv:2602.13151 (cs) · Submitted on 13 Feb 2026

Title: Quantization-Robust LLM Unlearning via Low-Rank Adaptation

Authors: João Vitor Boer Abitante, Joana Meneguzzo Pasquali, Luan Fonseca Garcia, Ewerton de Oliveira, Thomas da Silva Paula, Rodrigo C. Barros, Lucas S. Kupssinskü

Abstract: Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. Aggressive low-bit PTQ, however, can mask or erase unlearning updates, causing quantized models to revert to pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated with the MUSE dataset (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, an increase of 4.76). LoRA also substantially reduces privacy leakage under 4-bit PTQ; e.g., for GA+KLR on BOOKS, PrivLeak moves...
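The method described in the abstract — freeze the base weights, train only low-rank adapter factors, and apply the adapter on top of the quantized base — can be sketched as follows. This is a hedged toy illustration in NumPy, not the paper's implementation: the toy dimensions, the per-tensor min-max quantizer, and the choice to keep the adapter in full precision at inference are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # toy hidden size and LoRA rank

W = rng.normal(scale=0.3, size=(d, d))  # frozen base weight (not updated)
A = rng.normal(scale=0.1, size=(r, d))  # trainable adapter factor
B = np.zeros((d, r))                    # standard LoRA init: B = 0, so B @ A = 0

# ... the unlearning objective would update only A and B ...
B = rng.normal(scale=0.5, size=(d, r))  # stand-in for a trained adapter

def quantize_4bit(w):
    # Per-tensor uniform 4-bit quantization (16 levels over the tensor's range).
    lo, hi = w.min(), w.max()
    step = (hi - lo) / 15
    return np.round((w - lo) / step) * step + lo

# Effective weight at inference: quantized base plus full-precision adapter.
# Because the whole unlearning update lives in B @ A, it is not rounded away
# with the base weights.
W_eff = quantize_4bit(W) + B @ A
```

The key property is that the low-rank product B @ A carries the entire unlearning update outside the quantization grid of the base model, so low-bit PTQ of W cannot silently revert the model to its pre-unlearning behavior.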

