[2603.20492] AE-LLM: Adaptive Efficiency Optimization for Large Language Models
Computer Science > Machine Learning
arXiv:2603.20492 (cs)
[Submitted on 20 Mar 2026]

Title: AE-LLM: Adaptive Efficiency Optimization for Large Language Models
Authors: Kaito Tanaka, Masato Ito, Yuji Nishimura, Keisuke Matsuda, Aya Nakayama

Abstract: Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet their deployment remains challenging due to substantial computational costs, memory requirements, and energy consumption. Recent empirical studies have demonstrated that no single efficiency technique is universally optimal; instead, the effectiveness of methods such as efficient attention mechanisms, mixture-of-experts (MoE), parameter-efficient fine-tuning, and quantization varies significantly depending on task characteristics, resource constraints, and model scales. Building upon these insights, we propose AE-LLM, a unified framework that automatically selects and combines optimal efficiency techniques tailored to specific deployment scenarios. Our approach introduces a multi-objective optimization framework that jointly considers accuracy, latency, memory footprint, and energy consumption, while accounting for hardware constraints and task requirements. We develop an efficient search algorithm that explores the combinatorial space of efficiency techniques across architect...