[2602.17510] LORA-CRAFT: Cross-layer Rank Adaptation via Frozen Tucker Decomposition of Pre-trained Attention Weights
Summary
The paper presents LORA-CRAFT, a parameter-efficient fine-tuning method that applies Tucker tensor decomposition to pre-trained attention weights stacked across layers, freezes the resulting factors, and trains only small adaptation matrices, achieving competitive performance with fewer trainable parameters.
Why It Matters
LORA-CRAFT addresses the growing need for efficient model fine-tuning in machine learning, particularly in natural language processing. By reducing the number of parameters required for adaptation, it enhances the feasibility of deploying large models in resource-constrained environments, making advanced AI more accessible.
Key Takeaways
- LORA-CRAFT uses Tucker decomposition to optimize fine-tuning of transformer models.
- The method adapts pre-trained weights with significantly fewer parameters compared to existing techniques.
- Experiments show that LORA-CRAFT performs competitively on the GLUE benchmark.
- This approach can facilitate the deployment of large models in environments with limited resources.
- The technique bridges two lines of tensor-based PEFT work: methods that decompose gradient updates (LoTR, SuperLoRA) and methods that decompose pre-trained weights (PiSSA).
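To make the parameter savings concrete, here is a hedged back-of-envelope count. The hidden size, layer count, and ranks below are illustrative placeholders, not figures from the paper: LoRA trains two rectangular matrices per adapted weight per layer, whereas a cross-layer Tucker adaptation of the kind described trains only one small square matrix per Tucker factor for the whole stack.

```python
# Hypothetical dimensions (not from the paper): BERT-base-like model.
d, L = 768, 12            # hidden size, number of transformer layers
r_lora = 8                # typical LoRA rank

# LoRA: B (d x r) and A (r x d) per adapted matrix, per layer.
lora_params = L * 2 * d * r_lora
print(lora_params)        # 147456

# Cross-layer Tucker adaptation: one square matrix per factor,
# shared across all L layers (ranks are illustrative assumptions).
r1, r2, r3 = 12, 64, 64   # ranks along (layers, rows, cols)
craft_params = r1**2 + r2**2 + r3**2
print(craft_params)       # 8336
```

Under these assumed ranks the trainable-parameter count drops by more than an order of magnitude, which is the kind of saving the summary refers to; the paper's actual configurations may differ.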
Computer Science > Machine Learning
arXiv:2602.17510 (cs)
[Submitted on 19 Feb 2026]
Title: LORA-CRAFT: Cross-layer Rank Adaptation via Frozen Tucker Decomposition of Pre-trained Attention Weights
Authors: Kasun Dewage, Marianna Pensky, Suranadi De Silva, Shankadeep Mondal
Abstract: We introduce CRAFT (Cross-layer Rank Adaptation via Frozen Tucker), a parameter-efficient fine-tuning (PEFT) method that applies Tucker tensor decomposition to pre-trained attention weight matrices stacked across transformer layers and trains only small square adaptation matrices on the resulting frozen Tucker factors. Existing tensor-based PEFT methods decompose gradient updates: LoTR applies Tucker decomposition with shared factor matrices, while SuperLoRA groups and reshapes $\Delta W$ across layers before applying Tucker decomposition. Separately, methods like PiSSA apply SVD to pre-trained weights but operate independently per layer. CRAFT bridges these two lines of work: it performs full Tucker decomposition via Higher-Order SVD (HOSVD) directly on pre-trained weights organized as cross-layer 3D tensors, freezes all resulting factors, and adapts the model through lightweight trainable transformations applied to each factor matrix. Experiments on the ...
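The pipeline the abstract describes can be sketched numerically: stack one attention projection across layers into a 3D tensor, compute a truncated HOSVD (one SVD per mode-unfolding), keep the core and factor matrices frozen, and reconstruct with a small square matrix multiplied into each factor. The dimensions, ranks, and identity initialization below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def hosvd3(T, ranks):
    """Truncated Tucker decomposition of a 3D tensor via HOSVD."""
    factors = []
    for mode, r in enumerate(ranks):
        # mode-n unfolding: bring axis `mode` to the front, flatten the rest
        unf = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(unf, full_matrices=False)
        factors.append(U[:, :r])  # leading r left singular vectors
    # core = T  x_1 U1^T  x_2 U2^T  x_3 U3^T  (mode products with U_n^T)
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

# Stack the query-projection weights of L hypothetical layers (random stand-ins).
L, d = 4, 32
rng = np.random.default_rng(0)
W = rng.standard_normal((L, d, d))       # (layers, d_out, d_in)

ranks = (2, 8, 8)                        # illustrative Tucker ranks
G, (U1, U2, U3) = hosvd3(W, ranks)       # frozen core and factors

# Trainable part: one small square matrix per factor (identity at init,
# so the adapted model starts from the truncated reconstruction).
A1, A2, A3 = np.eye(ranks[0]), np.eye(ranks[1]), np.eye(ranks[2])

def reconstruct(G, U1, U2, U3, A1, A2, A3):
    """Mode products of the frozen core with the adapted factors U_n @ A_n."""
    T = np.tensordot(U1 @ A1, G, axes=(1, 0))
    T = np.moveaxis(np.tensordot(U2 @ A2, np.moveaxis(T, 1, 0), axes=1), 0, 1)
    T = np.moveaxis(np.tensordot(U3 @ A3, np.moveaxis(T, 2, 0), axes=1), 0, 2)
    return T

W_hat = reconstruct(G, U1, U2, U3, A1, A2, A3)
print(W_hat.shape)                       # (4, 32, 32)
```

Only `A1`, `A2`, `A3` would receive gradients during fine-tuning; with the ranks above that is $2^2 + 8^2 + 8^2 = 132$ trainable parameters against $4 \cdot 32 \cdot 32 = 4096$ frozen weights per stack. This is a minimal sketch of HOSVD-based cross-layer adaptation, not the authors' implementation.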