[2602.07849] LQA: A Lightweight Quantized-Adaptive Framework for Vision-Language Models on the Edge

arXiv - AI

Summary

The paper presents LQA, a lightweight quantized-adaptive framework designed to make Vision-Language Models (VLMs) deployable on edge devices, addressing both resource constraints and performance degradation under distribution shifts.

Why It Matters

As edge computing becomes increasingly vital for AI applications, optimizing Vision-Language Models for resource-constrained environments is essential. LQA provides a practical solution that balances efficiency and performance, making advanced AI more accessible on everyday devices.

Key Takeaways

  • LQA combines modality-aware quantization (Selective Hybrid Quantization, SHQ) with gradient-free test-time adaptation; a conceptual sketch follows this list.
  • The framework improves overall adaptation performance by 4.5% while using less memory than full-precision models.
  • LQA outperforms gradient-based test-time adaptation methods, achieving up to 19.9x lower memory usage across seven open-source datasets.
  • The approach is designed for robust and efficient deployment of VLMs on edge devices.
  • LQA supports privacy-preserving AI applications by minimizing resource demands.
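
The paper itself does not ship code in this summary; the sketch below is only meant to illustrate the general idea of modality-aware quantization for a CLIP-style VLM with separate vision and text towers. The helper names (`quantize_vlm`, `QuantLinear`), the choice of int8 for the vision tower and fp16 for the text tower, and the `.visual`/`.transformer` attribute names are illustrative assumptions, not the SHQ scheme described in the paper.

```python
# Illustrative sketch only: selectively quantize a CLIP-style VLM per modality.
# Bit-width choices and submodule names are assumptions, not the LQA/SHQ method.
import torch
import torch.nn as nn


def quantize_tensor_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization: returns (q_weight, scale)."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale


class QuantLinear(nn.Module):
    """Linear layer that stores int8 weights and dequantizes on the fly."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        q, scale = quantize_tensor_int8(linear.weight.data)
        self.register_buffer("q_weight", q)
        self.register_buffer("scale", scale)
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.q_weight.float() * self.scale  # dequantize weights on the fly
        return nn.functional.linear(x, w, self.bias)


def quantize_linears(module: nn.Module) -> None:
    """Recursively replace every nn.Linear under `module` with a QuantLinear."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, QuantLinear(child))
        else:
            quantize_linears(child)


def quantize_vlm(model: nn.Module) -> nn.Module:
    """Hypothetical modality-aware split: int8 for the vision tower, fp16 for
    the (often more precision-sensitive) text tower."""
    quantize_linears(model.visual)   # assumes a CLIP-style `.visual` submodule
    model.transformer.half()         # assumes a `.transformer` text tower
    return model
```

Treating the two modalities differently is the intuition behind "modality-aware" quantization; which layers actually tolerate low precision is exactly what SHQ decides in the paper.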

Computer Science > Artificial Intelligence
arXiv:2602.07849 (cs) [Submitted on 8 Feb 2026 (v1), last revised 16 Feb 2026 (this version, v2)]

Title: LQA: A Lightweight Quantized-Adaptive Framework for Vision-Language Models on the Edge
Authors: Xin Wang, Hualin Zhou, Sheng Guang Wang, Ting Dang, Yu Zhang, Hong Jia, Tao Gu

Abstract: Deploying Vision-Language Models (VLMs) on edge devices is challenged by resource constraints and performance degradation under distribution shifts. While test-time adaptation (TTA) can counteract such shifts, existing methods are too resource-intensive for on-device deployment. To address this challenge, we propose LQA, a lightweight, quantized-adaptive framework for VLMs that combines a modality-aware quantization strategy with gradient-free test-time adaptation. We introduce Selective Hybrid Quantization (SHQ) and a quantized, gradient-free adaptation mechanism to enable robust and efficient VLM deployment on resource-constrained hardware. Experiments across both synthetic and real-world distribution shifts show that LQA improves overall adaptation performance by 4.5%, uses less memory than full-precision models, and significantly outperforms gradient-based TTA methods, achieving up to 19.9x lower memory usage across seven open-source datasets. These results dem...
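The abstract describes LQA's adaptation mechanism as quantized and gradient-free but does not spell out the algorithm in this excerpt. As a general illustration of what gradient-free test-time adaptation can look like, here is a minimal, hypothetical cache-based sketch in the spirit of training-free cache/prototype methods: it refines zero-shot CLIP-style predictions using confidently pseudo-labeled test features, with no backpropagation. The class name `FeatureCacheTTA`, the entropy threshold, the logit scale, and the cache size are all illustrative assumptions, not the paper's mechanism.

```python
# Hypothetical gradient-free test-time adaptation sketch (cache-based).
# Shown only as an example of adapting without backpropagation; it is NOT
# the LQA mechanism described in the paper.
import torch


class FeatureCacheTTA:
    def __init__(self, num_classes: int, cache_size: int = 8, alpha: float = 1.0):
        self.cache = {c: [] for c in range(num_classes)}  # per-class feature cache
        self.cache_size = cache_size                      # max features stored per class
        self.alpha = alpha                                # weight of the cache logits

    @torch.no_grad()
    def __call__(self, image_feat: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        """image_feat: (d,) image embedding; text_feats: (num_classes, d),
        assumed already L2-normalized class embeddings."""
        image_feat = image_feat / image_feat.norm()
        zero_shot = 100.0 * image_feat @ text_feats.t()   # CLIP-style scaled logits

        # Cache the feature under its pseudo-label if the prediction is confident.
        pred = int(zero_shot.argmax())
        log_probs = zero_shot.log_softmax(-1)
        entropy = -(log_probs.exp() * log_probs).sum()
        if entropy.item() < 0.5 and len(self.cache[pred]) < self.cache_size:
            self.cache[pred].append(image_feat)

        # Cache logits: similarity of the query to each class prototype seen so far.
        cache_logits = torch.zeros_like(zero_shot)
        for c, feats in self.cache.items():
            if feats:
                prototype = torch.stack(feats).mean(0)
                cache_logits[c] = image_feat @ prototype

        return zero_shot + self.alpha * cache_logits      # adapted logits
```

Because nothing here is trained, no optimizer state or backward-pass activations have to be kept in memory, which is the kind of saving the abstract's memory comparison against gradient-based TTA methods points to.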


