[2602.13073] LCSB: Layer-Cyclic Selective Backpropagation for Memory-Efficient On-Device LLM Fine-Tuning
Summary
The paper presents Layer-Cyclic Selective Backpropagation (LCSB), a method for memory-efficient fine-tuning of large language models (LLMs) on mobile devices, achieving significant speedups with minimal quality loss.
Why It Matters
As mobile devices increasingly utilize large language models, optimizing memory and computational efficiency is crucial. LCSB offers a novel approach to enhance performance without compromising model quality, making advanced AI more accessible on resource-constrained devices.
Key Takeaways
- LCSB computes gradients for a subset of layers, reducing computational load.
- Achieves up to 1.40x speedup with less than 2% quality degradation.
- Demonstrates improved stability in 4-bit quantized settings compared to full backpropagation.
Computer Science > Machine Learning
arXiv:2602.13073 (cs) [Submitted on 13 Feb 2026]
Title: LCSB: Layer-Cyclic Selective Backpropagation for Memory-Efficient On-Device LLM Fine-Tuning
Authors: Juneyoung Park, Eunbeen Yoon, Seongwan Kim, Jaeho Lee
Abstract: Memory-efficient backpropagation (MeBP) has enabled first-order fine-tuning of large language models (LLMs) on mobile devices with less than 1GB memory. However, MeBP requires backward computation through all transformer layers at every step, where weight decompression alone accounts for 32--42% of backward time. We propose Layer-Cyclic Selective Backpropagation (LCSB), which computes gradients for only a subset of layers per step. Our key insight is that residual connections guarantee gradient flow through identity paths, while AdamW momentum provides implicit updates for non-selected layers. We interpret LCSB as Block Coordinate Descent on the LoRA parameter space, providing theoretical justification for convergence. LCSB achieves up to 1.40$\times$ speedup with less than 2\% quality degradation across five models and three tasks. Surprisingly, in 4-bit quantized settings, LCSB exhibits superior stability: a 3B model that completely diverges under full backpropagation converges smoothly with LCSB, suggesting an implicit regularization...
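The core scheduling idea described in the abstract, computing gradients for only one group of layers per step while cycling through the groups, can be sketched in plain Python. This is a minimal illustration under assumed conventions (contiguous groups, one group per step); the function name `lcsb_schedule` and the grouping scheme are hypothetical, not taken from the paper, whose actual selection policy may differ.

```python
def lcsb_schedule(num_layers: int, group_size: int, step: int) -> list[int]:
    """Return the layer indices whose gradients are computed at `step`.

    Hypothetical sketch of layer-cyclic selection: layers are partitioned
    into contiguous groups of `group_size`, and one group is selected per
    optimizer step, cycling through the groups. Non-selected layers skip
    backward computation; per the paper, residual connections still carry
    gradients past them and AdamW momentum supplies implicit updates.
    """
    num_groups = -(-num_layers // group_size)  # ceiling division
    g = step % num_groups                      # which group this step trains
    start = g * group_size
    return list(range(start, min(start + group_size, num_layers)))

# Example: 8 transformer layers, groups of 3, first four steps.
for step in range(4):
    print(step, lcsb_schedule(num_layers=8, group_size=3, step=step))
```

In a real fine-tuning loop, the returned indices would mark which layers' (LoRA) parameters have `requires_grad=True` for that step, so backward computation, and the costly weight decompression it triggers, is confined to the selected group.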