[2510.24702] Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents
Computer Science > Computation and Language
arXiv:2510.24702 (cs)
[Submitted on 28 Oct 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Authors: Yueqi Song, Ketan Ramaneti, Zaid Sheikh, Ziru Chen, Boyu Gou, Tianbao Xie, Yiheng Xu, Danyang Zhang, Apurva Gandhi, Fan Yang, Joseph Liu, Tianyue Ou, Zhihao Yuan, Frank Xu, Shuyan Zhou, Xingyao Wang, Xiang Yue, Tao Yu, Huan Sun, Yu Su, Graham Neubig

Abstract: Public research results on large-scale supervised finetuning of AI agents remain relatively rare, since the collection of agent training data presents unique challenges. In this work, we argue that the bottleneck is not a lack of underlying data sources, but that a large variety of data is fragmented across heterogeneous formats, tools, and interfaces. To this end, we introduce the agent data protocol (ADP), a light-weight representation language that serves as an "interlingua" between agent datasets in diverse formats and unified agent training pipelines downstream. The design of ADP is expressive enough to capture a large variety of tasks, including API/tool use, browsing, coding, software engineering, and general agentic workflows, while remaining simple to parse and train on without engineering ...
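
To make the "interlingua" idea concrete, the following is a minimal sketch of how one shared trajectory representation could sit between a source dataset and a training pipeline. It is not the ADP schema: the abstract does not specify it, so every name here (`Action`, `Observation`, `Trajectory`, `from_chat_format`, `to_training_pairs`, and the fields of the input record) is a hypothetical chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Literal, Union

# NOTE: all types and field names below are illustrative assumptions,
# not the schema defined by the ADP paper.


@dataclass
class Action:
    """Something the agent does: call an API, run code, or send a message."""
    kind: Literal["api", "code", "message"]
    content: str


@dataclass
class Observation:
    """Something the agent receives, from a user or from the environment."""
    source: Literal["user", "environment"]
    content: str


Step = Union[Action, Observation]


@dataclass
class Trajectory:
    id: str
    steps: list[Step] = field(default_factory=list)


def from_chat_format(record: dict) -> Trajectory:
    """Convert a hypothetical chat-style record into the shared form."""
    traj = Trajectory(id=record["id"])
    for turn in record["turns"]:
        role, text = turn["role"], turn["text"]
        if role == "assistant":
            kind = "code" if turn.get("is_code") else "message"
            traj.steps.append(Action(kind=kind, content=text))
        elif role == "user":
            traj.steps.append(Observation(source="user", content=text))
        else:
            # Any other role (for example "tool") is treated as environment output.
            traj.steps.append(Observation(source="environment", content=text))
    return traj


def to_training_pairs(traj: Trajectory) -> list[tuple[str, str]]:
    """Emit one (context, target) pair per action, with all prior steps as context."""
    pairs: list[tuple[str, str]] = []
    context: list[str] = []
    for step in traj.steps:
        if isinstance(step, Action):
            pairs.append(("\n".join(context), step.content))
        context.append(step.content)
    return pairs
```

Under this sketch, each source dataset needs one converter into `Trajectory` and each training pipeline needs one converter out of it, so supporting N datasets and M pipelines takes N + M converters rather than N × M.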