[2602.19672] SkillOrchestra: Learning to Route Agents via Skill Transfer
Summary
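The paper presents SkillOrchestra, a framework for skill-aware orchestration in compound AI systems. Rather than learning a routing policy end-to-end, it learns fine-grained skills from execution experience, models each agent's competence and cost under those skills, and routes accordingly, at a far lower learning cost than RL-trained orchestrators.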
Why It Matters
Compound AI systems can exceed the capabilities of individual models only if orchestration is effective. SkillOrchestra targets two limitations of existing routing methods: input-level routers make coarse query-level decisions that ignore evolving task requirements, and RL-trained orchestrators are expensive to adapt and prone to routing collapse. Its skill-based approach is more efficient and interpretable, which matters to developers and researchers optimizing multi-agent systems.
Key Takeaways
- SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% across ten benchmarks.
- The framework reduces learning costs by 700x compared to Router-R1 and by 300x compared to ToolOrchestra.
- It offers a principled alternative to data-intensive RL approaches.
- Skill modeling enables scalable and interpretable orchestration.
- The method targets routing collapse in multi-turn scenarios, where RL-trained orchestrators repeatedly invoke one strong but costly option.
Computer Science > Artificial Intelligence
arXiv:2602.19672 (cs) [Submitted on 23 Feb 2026]
Title: SkillOrchestra: Learning to Route Agents via Skill Transfer
Authors: Jiayu Wang, Yifei Ming, Zixuan Ke, Shafiq Joty, Aws Albarghouthi, Frederic Sala
Abstract: Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms SoTA RL-based orchestrators by up to 22.5% with 700x and 300x learning cost reduction compared to Router-R1 and ToolOrchestra, r...
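The deployment step the abstract describes (infer the skill demands of the current turn, then pick the agent that best satisfies them under a performance-cost trade-off) can be illustrated with a small Python sketch. This is not the paper's implementation. The `AgentProfile` class, the `score` and `route` functions, the linear cost penalty, the skill names, and every number below are hypothetical and chosen only for illustration. How SkillOrchestra actually discovers skills and estimates competence is not covered in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical per-agent profile: estimated success rate per skill and a per-call cost."""
    name: str
    competence: dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # illustrative cost per call, arbitrary units


def score(agent: AgentProfile, demands: dict[str, float], cost_weight: float) -> float:
    """Demand-weighted competence minus a cost penalty (an assumed linear trade-off)."""
    total = sum(demands.values())
    if total <= 0:
        raise ValueError("demands must carry positive total weight")
    expected = sum(
        weight * agent.competence.get(skill, 0.0) for skill, weight in demands.items()
    ) / total
    return expected - cost_weight * agent.cost


def route(agents: list[AgentProfile], demands: dict[str, float], cost_weight: float) -> AgentProfile:
    """Select the highest-scoring agent for the current turn's skill demands."""
    if not agents:
        raise ValueError("no agents to route to")
    return max(agents, key=lambda agent: score(agent, demands, cost_weight))


# Made-up profiles, for illustration only.
AGENTS = [
    AgentProfile("large-model", {"multi_hop_reasoning": 0.9, "fact_lookup": 0.9}, cost=1.0),
    AgentProfile("small-model", {"multi_hop_reasoning": 0.4, "fact_lookup": 0.85}, cost=0.1),
]

if __name__ == "__main__":
    # One turn that needs only simple lookup, one that needs multi-hop reasoning.
    print(route(AGENTS, {"fact_lookup": 1.0}, cost_weight=0.2).name)
    print(route(AGENTS, {"multi_hop_reasoning": 1.0}, cost_weight=0.2).name)
```

With these made-up numbers, the lookup turn goes to the cheaper agent and the reasoning turn goes to the stronger one. Scoring each turn separately is one plausible way to avoid always invoking the most expensive option, the failure mode the abstract calls routing collapse.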