[2602.12952] Transporting Task Vectors across Different Architectures without Training
Summary
The paper introduces 'Theseus,' a training-free method for transferring task-specific parameter updates across model architectures of different widths, avoiding the cost of relearning those updates for every model variant.
Why It Matters
This research addresses the cost of adapting pre-trained models to downstream tasks: the resulting task-specific parameter updates are typically expensive to relearn for every model variant. By enabling such updates to be transferred across models of different architectures, it opens new avenues for efficient deployment of AI systems, particularly in resource-constrained environments.
Key Takeaways
- Theseus allows for task-specific updates to be transferred across heterogeneous model architectures.
- The method utilizes functional matching on observed activations rather than direct parameter matching.
- Improvements were demonstrated in both vision and language models without additional training.
- The approach preserves the geometry of updates, enhancing stability and effectiveness.
- This research could significantly reduce the computational costs associated with model adaptation.
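The "task-specific updates" in the takeaways are task vectors: the difference between fine-tuned and pre-trained weights, applied by simple addition. A minimal toy sketch (variable names are illustrative, not from the paper):

```python
import numpy as np

# A task vector is the difference between fine-tuned and pre-trained
# weights; applying it to a compatible model is just addition.
theta_pre = np.array([0.5, -1.0, 2.0])   # toy pre-trained weights
theta_ft = np.array([0.7, -0.8, 1.5])    # toy weights after fine-tuning
tau = theta_ft - theta_pre               # task vector
theta_new = theta_pre + tau              # recovers the fine-tuned model
```

Transferring `tau` to a model with a *different* width is the open problem this paper tackles, since the subtraction above assumes identical parameter shapes.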
Paper Details
arXiv:2602.12952 (cs.LG) [Submitted on 13 Feb 2026]
Authors: Filippo Rinaldi, Aniello Panariello, Giacomo Salici, Angelo Porrello, Simone Calderara
Abstract: Adapting large pre-trained models to downstream tasks often produces task-specific parameter updates that are expensive to relearn for every model variant. While recent work has shown that such updates can be transferred between models with identical architectures, transferring them across models of different widths remains largely unexplored. In this work, we introduce Theseus, a training-free method for transporting task-specific updates across heterogeneous models. Rather than matching parameters directly, we characterize a task update by the functional effect it induces on intermediate representations. We formalize task-vector transport as a functional matching problem on observed activations and show that, after aligning representation spaces via orthogonal Procrustes analysis, it admits a stable closed-form solution that preserves the geometry of the update. We evaluate Theseus on vision and language models across different widths, showing consistent improvements over strong baselines without additional training or backpropagation.
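The alignment step the abstract names, orthogonal Procrustes analysis on observed activations, has a standard SVD closed form. Below is a minimal, hypothetical sketch of that idea (not the paper's actual algorithm: it assumes equal widths for simplicity, whereas Theseus targets differing widths, and all function names are illustrative):

```python
import numpy as np

def orthogonal_procrustes(A, B):
    # Closed-form orthogonal Q minimizing ||A @ Q - B||_F,
    # obtained from the SVD of A^T B.
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

def transport_task_update(delta_src, H_src, H_tgt):
    # Hypothetical transport: align source activations to target
    # activations, then conjugate the source update by the alignment
    # so its functional effect on activations is preserved.
    Q = orthogonal_procrustes(H_src, H_tgt)  # H_src @ Q ~= H_tgt
    return Q.T @ delta_src @ Q

# Toy check: two "models" whose activation spaces differ by a rotation.
rng = np.random.default_rng(0)
H_src = rng.standard_normal((128, 16))          # source activations
rot, _ = np.linalg.qr(rng.standard_normal((16, 16)))
H_tgt = H_src @ rot                             # target = rotated source
delta_src = 0.01 * rng.standard_normal((16, 16))
delta_tgt = transport_task_update(delta_src, H_src, H_tgt)
# The transported update acts on target activations as the original
# acted on source activations, rotated into the target basis.
err = np.linalg.norm(H_tgt @ delta_tgt - (H_src @ delta_src) @ rot)
```

In this rotation-only toy case the recovered alignment is exact and `err` is numerically zero; the paper's contribution is making this kind of functional matching stable when the two models do not share a width.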