[2506.01897] MLorc: Momentum Low-rank Compression for Memory Efficient Large Language Model Adaptation
Computer Science > Machine Learning

arXiv:2506.01897 (cs)

[Submitted on 2 Jun 2025 (v1), last revised 6 Apr 2026 (this version, v4)]

Title: MLorc: Momentum Low-rank Compression for Memory Efficient Large Language Model Adaptation

Authors: Wei Shen, Zhang Yaxiang, Minhui Huang, Mengfan Xu, Jiawei Zhang, Cong Shen

Abstract: With the increasing size of large language models (LLMs), full-parameter fine-tuning imposes substantial memory demands. To alleviate this, we propose a novel memory-efficient training paradigm called Momentum Low-rank Compression (MLorc). The key idea of MLorc is to compress and reconstruct the momentum of matrix parameters during training to reduce memory consumption. Compared to LoRA, MLorc avoids enforcing a fixed-rank constraint on weight update matrices and thus enables full-parameter learning. Compared to GaLore, MLorc directly compresses the momentum rather than the gradients, thereby better preserving the training dynamics of full-parameter fine-tuning. We provide a theoretical guarantee for its convergence under mild assumptions. Empirically, MLorc consistently outperforms other memory-efficient training methods, matches or even exceeds the performance of full fine-tuning at small ranks (e.g., $r=4$), and generalizes well across different optimizers, all while not compromising t…
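To make the core idea concrete, here is a minimal sketch of momentum low-rank compression in a plain SGD-with-momentum loop. The factorization scheme (truncated SVD via `torch.svd_lowrank`), the toy objective, and all hyperparameters below are illustrative assumptions, not the paper's exact algorithm: only the rank-$r$ factors of the momentum are stored between steps, so the optimizer state for an $n \times p$ matrix parameter shrinks from $O(np)$ to $O(r(n+p))$.

```python
# Hedged sketch: compress/reconstruct momentum of a matrix parameter.
# The SVD-based compressor and all constants here are assumptions for
# illustration, not MLorc's published update rule.
import torch

def compress(m: torch.Tensor, r: int):
    """Compress a momentum matrix to rank-r factors (U, s, V)."""
    U, s, V = torch.svd_lowrank(m, q=r)  # m ~= U @ diag(s) @ V.T
    return U, s, V

def reconstruct(U, s, V):
    """Rebuild the (approximate) momentum matrix from its factors."""
    return U @ torch.diag(s) @ V.T

torch.manual_seed(0)
W = torch.randn(256, 128, requires_grad=True)  # matrix parameter
factors = None                                 # compressed momentum state
lr, beta, r = 1e-2, 0.9, 4                     # r = 4 as in the abstract

for step in range(3):
    loss = (W ** 2).sum()                      # stand-in objective
    loss.backward()
    with torch.no_grad():
        # Reconstruct momentum from its low-rank factors, update it with
        # the fresh gradient, apply it, then re-compress before storing.
        m = reconstruct(*factors) if factors is not None else torch.zeros_like(W)
        m = beta * m + W.grad
        W -= lr * m
        factors = compress(m, r)               # keep only rank-r state
        W.grad = None
```

Unlike LoRA, the weight update applied to `W` here is not constrained to a fixed-rank subspace; only the stored momentum state is compressed, which is the distinction the abstract draws.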