[2602.14701] Unbiased Approximate Vector-Jacobian Products for Efficient Backpropagation
Summary
This paper presents methods to enhance the efficiency of backpropagation in deep learning by using unbiased approximate vector-Jacobian products, reducing computational and memory costs.
Why It Matters
As deep learning models grow in complexity, the demand for efficient training methods becomes critical. This research offers a novel approach to backpropagation, potentially leading to faster training times and lower resource consumption, which is vital for both academic research and industry applications.
Key Takeaways
- Introduces randomized, unbiased approximations for vector-Jacobian products during backpropagation.
- Theoretical analysis characterizes the trade-off between the number of epochs needed to reach a target precision and the per-epoch cost reduction.
- Identifies optimal unbiased estimates with minimal variance under sparsity constraints.
- Empirical validation on multi-layer perceptrons, BagNets, and Visual Transformer architectures demonstrates the approach's effectiveness.
- Potential to significantly reduce training costs for deep learning models.
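The core idea in the takeaways above — replacing the exact vector-Jacobian product with a randomized, sparse, unbiased estimate — can be illustrated with a minimal sketch. The paper's specific estimators are not reproduced here; this example assumes a simple scheme (uniform sampling of k coordinates of the incoming gradient, rescaled by the inverse inclusion probability) purely to demonstrate unbiasedness: only k rows of the Jacobian are needed per step, yet the estimator matches the exact VJP in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not from the paper): a layer whose Jacobian is J,
# and an incoming gradient vector v. The exact VJP is v @ J.
n, m = 8, 5
J = rng.normal(size=(n, m))
v = rng.normal(size=n)
exact = v @ J

def sparse_unbiased_vjp(v, J, k, rng):
    """Keep only k uniformly sampled coordinates of v, rescaled by n/k
    (the inverse inclusion probability), so that E[estimate] = v @ J."""
    n = v.shape[0]
    idx = rng.choice(n, size=k, replace=False)
    sparse_v = np.zeros(n)
    sparse_v[idx] = v[idx] * (n / k)
    return sparse_v @ J  # only the k sampled rows of J are actually touched

# Monte Carlo check of unbiasedness: averaging many cheap estimates
# should recover the exact VJP.
est = np.mean(
    [sparse_unbiased_vjp(v, J, k=2, rng=rng) for _ in range(20000)],
    axis=0,
)
print(np.max(np.abs(est - exact)))  # should be close to zero
```

Uniform sampling is only for illustration; the paper's minimal-variance result suggests that the optimal sampling distribution under a sparsity budget is non-uniform (e.g., weighted by coordinate magnitudes), which reduces the variance of the estimate and hence the number of extra epochs needed.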
Computer Science > Machine Learning
arXiv:2602.14701 (cs) [Submitted on 16 Feb 2026]
Title: Unbiased Approximate Vector-Jacobian Products for Efficient Backpropagation
Authors: Killian Bakong (DI-ENS), Laurent Massoulié (Inria, ARGO, CMAP), Edouard Oyallon (MLIA), Kevin Scaman
Abstract: In this work we introduce methods to reduce the computational and memory costs of training deep neural networks. Our approach consists in replacing exact vector-Jacobian products with randomized, unbiased approximations thereof during backpropagation. We provide a theoretical analysis of the trade-off between the number of epochs needed to achieve a target precision and the cost reduction for each epoch. We then identify specific unbiased estimates of vector-Jacobian products for which we establish desirable optimality properties of minimal variance under sparsity constraints. Finally, we provide in-depth experiments on multi-layer perceptron, BagNet, and Visual Transformer architectures. These validate our theoretical results, and confirm the potential of our proposed unbiased randomized backpropagation approach for reducing the cost of deep learning.
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2602.14701 [cs.LG]