[2603.04308] Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs

Computer Science > Machine Learning

arXiv:2603.04308 (cs) [Submitted on 4 Mar 2026]

Title: Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs

Authors: Pranav Kumar Kaliaperumal

Abstract: Post-training quantization (PTQ) of transformers is known to suffer from severe accuracy degradation due to structured activation outliers, as originally analyzed by Bondarenko et al. (EMNLP 2021) in work associated with Qualcomm AI Research. This paper provides a reproducible empirical reproduction and systems-level extension of that phenomenon in BERT-base fine-tuned on QNLI. When global W8A8 quantization is applied, validation accuracy drops sharply from 89.66% (FP32) to 54.33%, a decrease of 35.33 points. Statistical analysis of FP32 activations shows strongly heavy-tailed behavior that intensifies with model depth: kurtosis reaches 271 in the final layers and approximately 55% of activation energy is concentrated in the top 1% of channels. We evaluate several mitigation strategies. Mixed precision PTQ restores accuracy close to the FP32 baseline (89.42%). Per-embedding-group (PEG) quantization shows strong sensitivity to grouping structure, improving accuracy from 66.12% with three groups to 86.18% with four groups. In contrast, perc...
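The two statistics quoted in the abstract (kurtosis and the share of activation energy held by the top 1% of channels) can be illustrated with a short sketch. This is not the paper's code. It assumes Pearson (non-excess) kurtosis, defines "energy" as summed squared activations per channel, and treats "global" quantization as one symmetric int8 scale per tensor; the abstract does not confirm any of these choices. The synthetic activations, the outlier channel indices, and all function names are hypothetical.

```python
import random


def kurtosis(values):
    """Pearson (non-excess) kurtosis: fourth central moment / variance^2.
    Assumption: the abstract does not say which kurtosis convention it uses."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def top_channel_energy_share(activations, fraction=0.01):
    """Share of total squared activation held by the top `fraction` of channels.
    `activations` is a list of token rows, each a list of per-channel values.
    Assumption: "energy" means the sum of squares over tokens, per channel."""
    num_channels = len(activations[0])
    energy = [sum(row[c] ** 2 for row in activations) for c in range(num_channels)]
    k = max(1, int(num_channels * fraction))
    return sum(sorted(energy, reverse=True)[:k]) / sum(energy)


def fake_quantize_int8(values):
    """Symmetric int8 quantize-dequantize with a single scale for all values.
    One large outlier stretches the scale and coarsens every other value."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    scale = max_abs / 127.0
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    tokens, channels = 64, 768          # 768 is the BERT-base hidden size
    outlier_channels = {10, 200, 500}   # hypothetical outlier positions
    acts = [
        [rng.gauss(0.0, 1.0) * (40.0 if c in outlier_channels else 1.0)
         for c in range(channels)]
        for _ in range(tokens)
    ]
    flat = [v for row in acts for v in row]
    print("kurtosis:", kurtosis(flat))
    print("top-1% channel energy share:", top_channel_energy_share(acts))

    # Quantization error on the ordinary (non-outlier) channels only.
    dequant = fake_quantize_int8(flat)
    errors = [
        (a - b) ** 2
        for i, (a, b) in enumerate(zip(flat, dequant))
        if i % channels not in outlier_channels
    ]
    print("MSE on non-outlier channels:", sum(errors) / len(errors))
```

With a handful of scaled-up channels, both statistics rise well above their Gaussian values, and the single shared scale leaves few int8 levels for the remaining channels. That is the mechanism the abstract attributes the W8A8 accuracy drop to, although the numbers from this toy setup are not comparable to the paper's results.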

Originally published on March 05, 2026. Curated by AI News.
