[2510.09658] Gradient-Sign Masking for Task Vector Transport Across Pre-Trained Models
Summary
This paper presents Gradient-Sign Masking (GradFix), a method for transferring task vectors across pre-trained models without additional fine-tuning: the source task vector is masked to match the gradient-sign structure of the target model, estimated from a handful of labeled samples.
Why It Matters
When a new release of a foundation model is published, practitioners typically must repeat fine-tuning for tasks they had already solved on the previous version. This research offers a way to reuse existing task vectors instead, cutting the cost of repeated fine-tuning and making model adaptation more efficient as foundation models are updated.
Key Takeaways
- Gradient-Sign Masking allows for effective transfer of task vectors across different pre-trained models.
- The method requires no additional fine-tuning: it only computes a few target-model gradients on a handful of labeled samples, with no parameter updates.
- Empirical results show significant performance improvements on vision and language benchmarks.
- The approach guarantees that the transported vector is a first-order descent direction for the target loss, giving a theoretical grounding for its effectiveness.
- Transporting task vectors enhances multi-task and multi-source model merging capabilities.
Computer Science > Machine Learning
arXiv:2510.09658 (cs)
[Submitted on 7 Oct 2025 (v1), last revised 20 Feb 2026 (this version, v3)]
Title: Gradient-Sign Masking for Task Vector Transport Across Pre-Trained Models
Authors: Filippo Rinaldi, Aniello Panariello, Giacomo Salici, Fengyuan Liu, Marco Ciccone, Angelo Porrello, Simone Calderara
Abstract: When a new release of a foundation model is published, practitioners typically need to repeat fine-tuning, even if the same task was already tackled in the previous version. A promising alternative is to reuse the parameter changes (i.e., task vectors) that capture how a model adapts to a specific task. However, these vectors often fail to transfer across different pre-trained models because their parameter spaces are misaligned. In this work, we show that successful transfer depends strongly on the gradient-sign structure of the new model. Based on this insight, we propose GradFix, which approximates the ideal sign structure and leverages it to transfer knowledge using only a handful of labeled samples. Notably, this requires no additional fine-tuning: we only compute a few target-model gradients without parameter updates and mask the source task vector accordingly. This yields an update that is locally aligned with the target loss landscape, effectively rebasi...
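The abstract's masking idea can be illustrated with a toy sketch. This is a hedged, minimal reading of the approach, not the paper's implementation: the function `gradfix_mask` is a hypothetical name, the "gradient" is a stand-in computed in closed form for a quadratic loss, and the assumed masking rule is to keep only the task-vector entries whose sign agrees with the negative target gradient (i.e., entries that locally point downhill for the target model), zeroing the rest.

```python
import numpy as np

def gradfix_mask(task_vector, target_grad):
    """Keep task-vector entries whose sign matches the descent
    direction -target_grad; zero out the disagreeing entries.
    (Hypothetical sketch of a sign-based masking rule.)"""
    keep = np.sign(task_vector) == np.sign(-target_grad)
    return task_vector * keep

# Toy setup: gradient of the quadratic loss 0.5 * ||w - w_opt||^2
rng = np.random.default_rng(0)
w_target = rng.normal(size=8)     # new pre-trained weights (toy)
w_opt = rng.normal(size=8)        # task optimum (toy)
task_vector = rng.normal(size=8)  # delta fine-tuned on a *different* base model

grad = w_target - w_opt           # stand-in for a few-sample gradient estimate
masked = gradfix_mask(task_vector, grad)

# Every surviving entry opposes the gradient, so the masked update
# is a (non-strict) first-order descent direction: <masked, grad> <= 0.
print(float(masked @ grad) <= 0.0)  # True
```

The sign-agreement check is what yields the first-order descent property claimed in the takeaways: each retained coordinate contributes a non-positive term to the inner product with the target gradient.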