[2603.22324] DAQ: Delta-Aware Quantization for Post-Training LLM Weight Compression
Computer Science > Machine Learning
arXiv:2603.22324 (cs) [Submitted on 20 Mar 2026]

Title: DAQ: Delta-Aware Quantization for Post-Training LLM Weight Compression
Authors: Xiaoming Yu, Shize Tang, Guanghua Yu, Linchuan Xie, Song Liu, Jianchen Zhu, Feng Li

Abstract: We introduce Delta-Aware Quantization (DAQ), a data-free post-training quantization framework that preserves the knowledge acquired during post-training. Standard quantization objectives minimize reconstruction error but are agnostic to the base model, allowing quantization noise to disproportionately corrupt the small-magnitude parameter deltas ($\Delta W$) that encode post-training behavior -- an effect we analyze through the lens of quantization as implicit regularization. DAQ replaces reconstruction-based objectives with two delta-aware metrics -- Sign Preservation Rate and Cosine Similarity -- that directly optimize for directional fidelity of $\Delta W$, requiring only the base and post-trained weight matrices. In a pilot FP8 study, DAQ recovers style-specific capabilities lost under standard quantization while maintaining general performance.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.22324 [cs.LG] (or arXiv:2603.22324v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2603.22324
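The abstract names the two delta-aware metrics (Sign Preservation Rate and Cosine Similarity) but does not spell out their computation. The following is a minimal sketch of how such metrics could be evaluated from only the base and post-trained weight matrices, as the abstract describes; the function names, the stand-in symmetric integer quantizer, and all numeric details are assumptions for illustration, not the paper's actual FP8 pipeline or objective.

```python
import numpy as np

def quantize_symmetric(w, num_bits=8):
    # Stand-in symmetric per-tensor quantizer (the paper's pilot study uses FP8;
    # this is only a placeholder to produce a quantized weight matrix).
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale) * scale

def delta_aware_metrics(w_base, w_post, w_post_quant):
    # Post-training delta before and after quantizing the post-trained weights.
    delta = w_post - w_base
    delta_q = w_post_quant - w_base

    # Sign Preservation Rate: fraction of entries whose delta keeps its sign.
    spr = np.mean(np.sign(delta) == np.sign(delta_q))

    # Cosine Similarity of the flattened deltas (directional fidelity of delta W).
    cos = np.dot(delta.ravel(), delta_q.ravel()) / (
        np.linalg.norm(delta) * np.linalg.norm(delta_q) + 1e-12
    )
    return spr, cos

# Toy usage: a base matrix plus a small-magnitude post-training delta.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(256, 256)).astype(np.float32)
w_post = w_base + 0.01 * rng.normal(size=(256, 256)).astype(np.float32)
w_post_quant = quantize_symmetric(w_post)
print(delta_aware_metrics(w_base, w_post, w_post_quant))
```

Under this reading, a delta-aware quantizer would search over quantization choices to maximize these two quantities rather than to minimize reconstruction error on the post-trained weights alone.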