[2603.26778] TED: Training-Free Experience Distillation for Multimodal Reasoning
Computer Science > Machine Learning
arXiv:2603.26778 (cs)
[Submitted on 25 Mar 2026]
Title: TED: Training-Free Experience Distillation for Multimodal Reasoning
Authors: Shuozhi Yuan, Jinqing Wang, Zihao Liu, Miaomiao Yuan, Haoran Peng, Jin Zhao, Bingwen Wang, Haoyi Wang
Abstract: Knowledge distillation is typically realized by transferring a teacher model's knowledge into a student's parameters through supervised or reinforcement-based optimization. While effective, such approaches require repeated parameter updates and large-scale training data, limiting their applicability in resource-constrained environments. In this work, we propose TED, a training-free, context-based distillation framework that shifts the update target of distillation from model parameters to an in-context experience injected into the student's prompt. For each input, the student generates multiple reasoning trajectories, while a teacher independently produces its own solution. The teacher then compares the student trajectories with its reasoning and the ground-truth answer, extracting generalized experiences that capture effective reasoning patterns. These experiences are continuously refined and updated over time. A key challenge of context-based disti...
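
The loop described in the abstract (student trajectories, an independent teacher solution, teacher-side comparison against the ground truth, and an experience store injected into the student's prompt) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `ExperienceStore`, `ted_step`, the prompt wording, and the three callables `student`, `teacher_solve`, and `teacher_extract` are hypothetical names introduced here, and the deduplicating append in `update` is only a stand-in for the refinement procedure, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ExperienceStore:
    """Holds the in-context experiences; no model parameters are updated."""
    items: List[str] = field(default_factory=list)

    def build_prompt(self, question: str) -> str:
        # Inject the accumulated experiences ahead of the question.
        if not self.items:
            return question
        bullet_list = "\n".join(f"- {item}" for item in self.items)
        return (
            "Experiences from earlier problems:\n"
            f"{bullet_list}\n\nQuestion: {question}"
        )

    def update(self, new_items: List[str]) -> None:
        # Assumption: append unseen experiences only. The paper's actual
        # refinement rule is not given in the abstract.
        for item in new_items:
            if item not in self.items:
                self.items.append(item)


def ted_step(
    question: str,
    ground_truth: str,
    student: Callable[[str], str],
    teacher_solve: Callable[[str], str],
    teacher_extract: Callable[[List[str], str, str], List[str]],
    store: ExperienceStore,
    num_trajectories: int = 4,
) -> List[str]:
    """Run one distillation step and return the student's trajectories."""
    prompt = store.build_prompt(question)
    # The student samples several reasoning trajectories for the same input.
    trajectories = [student(prompt) for _ in range(num_trajectories)]
    # The teacher solves the problem independently of the student.
    teacher_solution = teacher_solve(question)
    # The teacher compares trajectories with its own reasoning and the
    # ground-truth answer, returning generalized experiences.
    experiences = teacher_extract(trajectories, teacher_solution, ground_truth)
    store.update(experiences)
    return trajectories
```

In practice the three callables would wrap calls to the student and teacher models; passing them in as plain functions keeps the sketch independent of any particular model API.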