[2603.04162] Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit

[2603.04162] Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

arXiv - AI March 05, 2026 3 min read

About this article

Abstract page for arXiv paper 2603.04162: Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

Computer Science > Computation and Language arXiv:2603.04162 (cs) [Submitted on 4 Mar 2026] Title:Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model Authors:Jakub Prejzner View a PDF of the paper titled Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model, by Jakub Prejzner View PDF HTML (experimental) Abstract:We present Bielik-Q2-Sharp, the first systematic academic evaluation of extreme 2-bit quantization applied to a Polish large language model. Using Bielik-11B-v2.3-Instruct (11B parameters, Mistral architecture) as our base model, we compare six state-of-the-art post-training quantization methods -- QuIP#, SpinQuant+GPTQ, ButterflyQuant, QTIP, VPTQ, and AQLM -- all calibrated on a Polish-language corpus (CulturaX-PL) with shared Hessian matrices. Our best variant (QuIP# E8P12) achieves 71.92% across 22 Polish benchmarks versus 72.07% for the IQ2_XXS baseline -- within statistical noise, at a modest size premium (3.26 GB vs. ~2.6 GB). On eq_bench, our method scores 47.14 versus 43.53 (+3.6pp), suggesting superior preservation of higher-order reasoning. QTIP achieves the best per-bit efficiency (79.4% MC acc_norm at ~2.4 bpw, 3.27 GB), matching VPTQ's quality at 35% smaller size. We additionally document a MC-generation dissociation phenomenon where rotation-based methods preserve log-likelihood quality but fail catastrophically at autoregressive generation. Th...

Originally published on March 05, 2026. Curated by AI News.

Llms

[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request

I'm looking to work with people interested in math, machine learning, or agentic coding, on creating a multi-agent framework to do fronti...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

I Asked ChatGPT 500 Questions. Here Are the Ads I Saw Most Often | WIRED

Ads are rolling out across the US on ChatGPT’s free tier. I asked OpenAI's bot 500 questions to see what these ads were like and how they...

Wired - AI · 9 min · about 3 hours ago

Llms

Abacus.Ai Claw LLM consumes an incredible amount of credit without any usage :(

Three days ago, I clicked the "Deploy OpenClaw In Seconds" button to get an overview of the new service, but I didn't build any automatio...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

Llms

Google’s Gemini AI app debuts in Hong Kong

Tech giant’s chatbot service tops Apple’s app store chart in the city.

AI Tools & Products · 2 min · about 5 hours ago

[2603.04162] Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

About this article

Related Articles

[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request

I Asked ChatGPT 500 Questions. Here Are the Ads I Saw Most Often | WIRED

Abacus.Ai Claw LLM consumes an incredible amount of credit without any usage :(

Google’s Gemini AI app debuts in Hong Kong

No comments

Stay updated with AI News