[2512.06443] Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices
Computer Science > Distributed, Parallel, and Cluster Computing

arXiv:2512.06443 (cs)

[Submitted on 6 Dec 2025 (v1), last revised 14 Apr 2026 (this version, v2)]

Title: Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Authors: Xiangyu Li, Chengyu Yin, Weijun Wang, Jianyu Wei, Ting Cao, Yunxin Liu

Abstract: Large language models (LLMs) are increasingly deployed on edge devices. To meet strict resource constraints, real-world deployment has pushed LLM quantization from 8-bit to 4-bit, 2-bit, and now 1.58-bit. Combined with lookup table (LUT)-based inference, CPUs run these ultra-low-bit LLMs even faster than NPUs, opening new opportunities for ubiquitous on-device intelligence. However, this paper identifies that LUT-based inference underutilizes memory bandwidth during parallel inference, which is required for prefilling, test-time scaling, and other multi-token scenarios. The root cause is the scalar LUT paradigm, which performs repetitive and non-contiguous memory accesses for each token. To solve the issue, we propose vector LUT, a new lookup paradigm that constructs a unified LUT across parallel tokens and performs a single $1 \rightarrow N$ lookup per index. To realize it efficiently, we further introduce (1) Vector LUT-Centric Tensor Layout, and (2) Cache-...
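The scalar-versus-vector lookup contrast in the abstract can be made concrete with a small sketch. Below is a minimal C illustration; it is hypothetical: the token count N_TOKENS, table size LUT_SIZE, and function names are ours for illustration, not the paper's kernels, and the table entries stand for precomputed partial dot products in the usual LUT-based-inference scheme. The scalar variant issues N scattered loads per weight index, one per token's private table; the vector variant keeps a unified table whose rows are N-vectors, so each index costs a single contiguous 1 -> N lookup.

    /* Hypothetical sketch: scalar per-token LUT vs. unified vector LUT.
       Assumptions (not from the paper): 8 parallel tokens, 4-bit weight
       indices, float partial sums precomputed per activation group. */
    #include <stddef.h>
    #include <stdint.h>

    #define N_TOKENS 8    /* parallel tokens N (illustrative) */
    #define LUT_SIZE 16   /* 2^4 entries for 4-bit weight indices */

    /* Scalar LUT paradigm: one table per token. Each weight index is
       looked up N times, in N non-contiguous per-token tables. */
    void scalar_lut(const uint8_t *w_idx, size_t n_groups,
                    const float lut[N_TOKENS][LUT_SIZE],
                    float acc[N_TOKENS]) {
        for (size_t g = 0; g < n_groups; g++) {
            uint8_t idx = w_idx[g];
            for (int t = 0; t < N_TOKENS; t++)
                acc[t] += lut[t][idx];  /* N scattered loads per index */
        }
    }

    /* Vector LUT paradigm: a unified table whose entry for each index is
       an N-vector spanning all parallel tokens. One lookup per index
       returns a contiguous row covering every token. */
    void vector_lut(const uint8_t *w_idx, size_t n_groups,
                    const float vlut[LUT_SIZE][N_TOKENS],
                    float acc[N_TOKENS]) {
        for (size_t g = 0; g < n_groups; g++) {
            const float *row = vlut[w_idx[g]];  /* single 1 -> N lookup */
            for (int t = 0; t < N_TOKENS; t++)
                acc[t] += row[t];  /* contiguous, SIMD-friendly adds */
        }
    }

In the vector variant the per-index row is contiguous in memory, so the inner accumulation vectorizes and the repeated, strided accesses of the scalar paradigm disappear; this is the bandwidth-utilization argument the abstract makes, sketched here under our assumptions rather than as the paper's actual tensor layout.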