[2603.29535] Quantization with Unified Adaptive Distillation to enable multi-LoRA based one-for-all Generative Vision Models on edge
Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.29535 (cs) [Submitted on 31 Mar 2026]

Title: Quantization with Unified Adaptive Distillation to enable multi-LoRA based one-for-all Generative Vision Models on edge

Authors: Sowmya Vajrala, Aakash Parmar, Prasanna R, Sravanth Kodavanti, Manjunath Arveti, Srinivas Soumitri Miriyala, Ashok Senapati

Abstract: Generative Artificial Intelligence (GenAI) features such as image editing, object removal, and prompt-guided image transformation are increasingly integrated into mobile applications. However, deploying Large Vision Models (LVMs) for such tasks on resource-constrained devices remains challenging due to their high memory and compute requirements. While Low-Rank Adapters (LoRAs) enable parameter-efficient task adaptation, existing mobile deployment pipelines typically compile a separate model binary for each LoRA plus a copy of the foundation model, resulting in redundant storage and increased runtime overhead. In this work, we present a unified framework for enabling multi-task GenAI inference on edge devices using a single shared model. Our key idea is to treat LoRA weights as runtime inputs rather than embedding them into the compiled model graph, allowing dynamic task switching at runtime wi...
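The abstract's key idea, treating LoRA weights as runtime inputs to a single shared model rather than folding them into each compiled graph, can be sketched as follows. This is a minimal NumPy illustration under standard LoRA assumptions (W' = W + (alpha/r)·B·A); the function and adapter names are hypothetical, not the paper's implementation.

```python
import numpy as np

def lora_linear(x, W, lora=None, alpha=16):
    """Frozen base linear layer with the LoRA adapter supplied at call time.

    x: (batch, d_in) input; W: (d_out, d_in) frozen foundation weight.
    lora: optional (A, B) pair, A: (r, d_in), B: (d_out, r).
    Because the adapter is an argument rather than baked into W, one
    compiled graph can serve many tasks: switching tasks means feeding
    different (A, B) tensors, not loading a different model binary.
    """
    y = x @ W.T                          # shared foundation-model path
    if lora is not None:
        A, B = lora
        r = A.shape[0]
        y = y + (alpha / r) * (x @ A.T @ B.T)   # low-rank task delta
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))
W = rng.standard_normal((4, 8))          # frozen base weight, stored once
# Two hypothetical task adapters (rank r = 2), switched at runtime:
A1, B1 = rng.standard_normal((2, 8)), rng.standard_normal((4, 2))
A2, B2 = rng.standard_normal((2, 8)), rng.standard_normal((4, 2))

y_base  = lora_linear(x, W)              # foundation model only
y_task1 = lora_linear(x, W, (A1, B1))    # task 1, same graph
y_task2 = lora_linear(x, W, (A2, B2))    # task 2, no recompilation
```

The output of the runtime-input formulation matches the conventional merged weight W + (alpha/r)·B·A exactly, so dynamic switching trades no accuracy for the storage savings the abstract describes.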