[2602.13710] HBVLA: Pushing 1-Bit Post-Training Quantization for Vision-Language-Action Models
Summary
The paper presents HBVLA, a framework for 1-bit post-training quantization of Vision-Language-Action models, enhancing efficiency while maintaining performance on resource-constrained devices.
Why It Matters
As AI models grow larger, deploying them on limited hardware becomes a significant challenge. HBVLA addresses this by enabling 1-bit quantization without substantial performance loss, making advanced AI applications more practical in real-world settings, especially in robotics.
Key Takeaways
- HBVLA improves the efficiency of Vision-Language-Action models through 1-bit quantization.
- The framework retains high performance, with quantized models achieving over 92% of full-precision performance.
- Utilizes a policy-aware enhanced Hessian to identify critical weights for action generation.
- Demonstrates robust deployability on hardware-limited platforms, crucial for robotics.
- Provides a practical foundation for ultra-low-bit quantization, expanding the usability of AI models.
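The takeaways above center on group-wise 1-bit quantization: each small group of weights is reduced to signs plus one full-precision scale. A minimal sketch of that general idea follows (not HBVLA's exact procedure, which additionally splits salient from non-salient weights and quantizes in a transformed domain); `binarize_groupwise` is a hypothetical helper name.

```python
import numpy as np

def binarize_groupwise(w, group_size=64):
    """Group-wise 1-bit quantization sketch: within each group,
    keep only sign(w) plus one shared scale alpha = mean(|w|),
    which minimizes the L2 reconstruction error for sign codes."""
    w = np.asarray(w, dtype=np.float64)
    pad = (-len(w)) % group_size          # pad so length divides evenly
    groups = np.pad(w, (0, pad)).reshape(-1, group_size)
    scales = np.abs(groups).mean(axis=1, keepdims=True)
    q = scales * np.sign(groups)          # dequantized weights
    return q.reshape(-1)[:len(w)]

rng = np.random.default_rng(0)
w = rng.standard_normal(256)
wq = binarize_groupwise(w, group_size=64)
# storage cost: 1 bit per weight + one FP scale per 64 weights
```

Because each group gets its own optimal scale, the reconstruction error can only match or beat a single global scale, which is why finer groups are a common lever in ultra-low-bit quantization.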
Computer Science > Machine Learning
arXiv:2602.13710 (cs) [Submitted on 14 Feb 2026]
Title: HBVLA: Pushing 1-Bit Post-Training Quantization for Vision-Language-Action Models
Authors: Xin Yan, Zhenglin Wan, Feiyang Ye, Xingrui Yu, Hangyu Du, Yang You, Ivor Tsang
Abstract: Vision-Language-Action (VLA) models enable instruction-following embodied control, but their large compute and memory footprints hinder deployment on resource-constrained robots and edge platforms. While reducing weights to 1-bit precision through binarization can greatly improve efficiency, existing methods fail to narrow the distribution gap between binarized and full-precision weights, causing quantization errors to accumulate under long-horizon closed-loop execution and severely degrade actions. To fill this gap, we propose HBVLA, a VLA-tailored binarization framework. First, we use a policy-aware enhanced Hessian to identify weights that are truly critical for action generation. Then, we employ a sparse orthogonal transform for non-salient weights to induce a low-entropy intermediate state. Finally, we quantize both salient and non-salient weights in the Haar domain with group-wise 1-bit quantization. We have evaluated our approach on different VLAs: on LIBERO, quantized OpenVLA-OFT retains 92.2% of full-precision performance; on Sim...
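The abstract's first step, identifying action-critical (salient) weights with a Hessian, can be illustrated with the generic second-order sensitivity score s_i = w_i^2 * H_ii used by many post-training quantization methods. The paper's policy-aware enhanced Hessian is more elaborate; the sketch below, with the hypothetical helper `saliency_topk`, only shows the ranking-and-masking mechanics.

```python
import numpy as np

def saliency_topk(W, H_diag, k):
    """Rank weights by s_i = w_i^2 * H_ii (diagonal-Hessian
    sensitivity) and return a boolean mask flagging the top-k
    most salient weights. Generic stand-in, not HBVLA's exact
    policy-aware enhanced Hessian."""
    s = (W.ravel() ** 2) * np.asarray(H_diag).ravel()
    idx = np.argsort(s)[::-1][:k]        # indices of largest scores
    mask = np.zeros(W.size, dtype=bool)
    mask[idx] = True
    return mask.reshape(W.shape)

# Toy example: with a uniform Hessian diagonal, saliency reduces
# to weight magnitude, so the -3.0 entry is flagged first.
W = np.array([[0.1, 2.0],
              [0.5, -3.0]])
H_diag = np.ones(4)
mask = saliency_topk(W, H_diag, k=1)
```

Salient weights flagged this way would be protected (e.g. kept at higher precision or quantized more carefully), while the remaining weights go through the cheaper transform-then-binarize path.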