[2411.11707] Federated Co-tuning Framework for Large and Small Language Models
Summary
The paper presents FedCoLLM, a federated co-tuning framework that enhances the performance of both Large Language Models (LLMs) and Small Language Models (SLMs) by facilitating knowledge transfer while ensuring data privacy.
Why It Matters
As LLMs become integral to a wide range of applications, optimizing their performance in conjunction with smaller models is crucial. FedCoLLM addresses the challenge of mutual enhancement between LLMs and SLMs, making it relevant for developers and researchers focused on efficient model training and deployment in privacy-sensitive environments.
Key Takeaways
- FedCoLLM allows for simultaneous co-tuning of LLMs and SLMs, enhancing their performance.
- The framework respects data privacy while minimizing computational and communication overhead.
- Evaluation shows significant performance improvements for SLMs with LLM assistance.
- FedCoLLM achieves results comparable to direct fine-tuning on client data.
- The code is available as part of the FATE open-source project, promoting accessibility.
Computer Science > Computation and Language — arXiv:2411.11707 (cs)
[Submitted on 18 Nov 2024 (v1), last revised 21 Feb 2026 (this version, v2)]
Title: Federated Co-tuning Framework for Large and Small Language Models
Authors: Tao Fan, Yan Kang, Guoqiang Ma, Lixin Fan, Shuoling Liu, Kai Chen, Qiang Yang
Abstract: By adapting Large Language Models (LLMs) to domain-specific tasks or enriching them with domain-specific knowledge, we can fully harness the capabilities of LLMs. Nonetheless, a gap persists in achieving simultaneous mutual enhancement between the server's LLM and the downstream clients' Small Language Models (SLMs). To address this, we propose FedCoLLM, a novel and parameter-efficient federated framework designed for co-tuning LLMs and SLMs. This approach is aimed at adaptively transferring server-side LLM knowledge to clients' SLMs while simultaneously enriching the LLMs with domain insights from the clients. To accomplish this, FedCoLLM utilizes lightweight adapters in conjunction with SLMs, facilitating knowledge exchange between server and clients in a manner that respects data privacy while also minimizing computational and communication overhead. Our evaluation of FedCoLLM, utilizing various public LLMs and SLMs across a range of NLP text generation tasks, reveals that the performance of clients' SLMs expe...
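To make the co-tuning idea concrete, here is a minimal toy sketch of one federated round in the style the abstract describes: clients fine-tune lightweight SLM adapters on private data, the server aggregates only the adapters (FedAvg-style), and server and clients exchange knowledge through a distillation step. All names (`fedavg`, `local_update`, `distill`), the plain-vector "adapters", and the toy gradient step are illustrative assumptions, not the paper's actual implementation.

```python
from typing import List

def fedavg(adapters: List[List[float]]) -> List[float]:
    """Average client adapter parameters element-wise (FedAvg-style)."""
    n = len(adapters)
    return [sum(vals) / n for vals in zip(*adapters)]

def local_update(adapter: List[float], private_grad: List[float],
                 lr: float = 0.1) -> List[float]:
    """One toy local SGD step on a client's private data (illustrative)."""
    return [w - lr * g for w, g in zip(adapter, private_grad)]

def distill(student: List[float], teacher: List[float],
            alpha: float = 0.5) -> List[float]:
    """Pull student parameters toward the teacher; a stand-in for
    logit-level knowledge distillation between LLM and SLMs."""
    return [(1 - alpha) * s + alpha * t for s, t in zip(student, teacher)]

# One federated co-tuning round with two clients.
client_adapters = [[1.0, 2.0], [3.0, 4.0]]
grads = [[0.5, 0.5], [0.5, 0.5]]  # pretend gradients from private data

# 1) Clients fine-tune their SLM adapters locally; raw data never leaves.
client_adapters = [local_update(a, g) for a, g in zip(client_adapters, grads)]
# 2) Server aggregates only the lightweight adapters (low communication cost).
global_adapter = fedavg(client_adapters)
# 3) Mutual enhancement: the server-side adapter absorbs domain knowledge
#    from the aggregated client adapters via distillation.
server_adapter = distill([0.0, 0.0], global_adapter)

print(global_adapter, server_adapter)
```

The point of the sketch is communication shape, not learning dynamics: only small adapter vectors cross the network in each direction, which is how the framework keeps both computational and communication overhead low while still letting knowledge flow both ways.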