[2506.03230] DiaBlo: Diagonal Blocks Are Sufficient For Finetuning
Computer Science > Machine Learning
arXiv:2506.03230 (cs)
[Submitted on 3 Jun 2025 (v1), last revised 2 Mar 2026 (this version, v2)]

Title: DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

Authors: Selcuk Gurses, Aozhong Zhang, Yanxia Deng, Xun Dong, Xin Li, Naigang Wang, Penghang Yin, Zi Yang

Abstract: Fine-tuning is a critical step for adapting large language models (LLMs) to domain-specific downstream tasks. To mitigate the substantial computational and memory costs of full-model fine-tuning, Parameter-Efficient Fine-Tuning (PEFT) methods have been proposed to update only a small subset of model parameters. However, performance gaps between PEFT approaches and full-model fine-tuning still exist. In this work, we present DiaBlo, a simple yet effective PEFT approach that updates only the diagonal blocks of selected model weight matrices. Unlike Low-Rank Adaptation (LoRA) and its variants, DiaBlo eliminates the need for low-rank matrix products, thereby avoiding the reliance on auxiliary initialization schemes or customized optimization strategies to improve convergence. This design leads to stable and robust convergence while maintaining comparable memory efficiency and training speed to LoRA. Moreover, we provide theoretical guarantees showing that, under mild low-rank conditions, DiaBlo is more expressive than LoRA in ...
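
The core idea in the abstract, training only the diagonal blocks of a weight matrix while the rest stays frozen, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `block_diag_forward` and `dense_block_diag`, the toy shapes, the equal-sized blocks, and the zero initialization of the blocks are all assumptions made for illustration.

```python
import numpy as np


def block_diag_forward(x, W, D):
    """Compute x @ (W + blockdiag(D)) without building the block-diagonal matrix.

    x: (batch, d_in) inputs
    W: (d_in, d_out) frozen pretrained weight
    D: (n_blocks, b_in, b_out) trainable diagonal blocks (hypothetical layout)
    """
    n_blocks, b_in, b_out = D.shape
    assert W.shape == (n_blocks * b_in, n_blocks * b_out)
    # Split each input row into n_blocks chunks; chunk k only meets block k.
    xb = x.reshape(x.shape[0], n_blocks, b_in)
    delta = np.einsum("bni,nio->bno", xb, D)
    return x @ W + delta.reshape(x.shape[0], n_blocks * b_out)


def dense_block_diag(D):
    """Materialize the block-diagonal update as a dense matrix (for checking only)."""
    n_blocks, b_in, b_out = D.shape
    M = np.zeros((n_blocks * b_in, n_blocks * b_out))
    for k in range(n_blocks):
        M[k * b_in:(k + 1) * b_in, k * b_out:(k + 1) * b_out] = D[k]
    return M


rng = np.random.default_rng(0)
d_in, d_out, n_blocks, batch = 8, 12, 4, 3  # toy sizes, chosen arbitrarily
W = rng.standard_normal((d_in, d_out))
x = rng.standard_normal((batch, d_in))

# Assumption: blocks start at zero, so the adapted layer initially matches the frozen one.
D = np.zeros((n_blocks, d_in // n_blocks, d_out // n_blocks))
y0 = block_diag_forward(x, W, D)

# Stand-in for blocks after some training steps.
D_trained = rng.standard_normal(D.shape)
y1 = block_diag_forward(x, W, D_trained)
y_ref = x @ (W + dense_block_diag(D_trained))

trainable_params = D.size  # n_blocks * b_in * b_out = d_in * d_out / n_blocks
```

In this layout the update is a single trainable tensor with `d_in * d_out / n_blocks` entries, so there is no product of two low-rank factors to initialize. By comparison, a standard rank-`r` LoRA update on the same matrix trains `r * (d_in + d_out)` entries split across two factors.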