[2408.10746] Resource-Efficient Personal Large Language Models Fine-Tuning with Collaborative Edge Computing
Summary
The paper presents PAC, a collaborative edge computing framework for resource-efficient fine-tuning of personal large language models (LLMs) that addresses the compute and memory constraints of edge devices to improve training efficiency.
Why It Matters
As the demand for personal LLMs grows, efficient fine-tuning methods are crucial for maintaining data privacy and security while optimizing resource use. PAC offers a novel solution that could significantly improve the feasibility of deploying LLMs on edge devices, making advanced AI applications more accessible.
Key Takeaways
- PAC framework enables efficient fine-tuning of personal LLMs on edge devices.
- Utilizes Parallel Adapters and an activation cache to reduce computational load.
- Achieves up to 8.64x speedup and 88.16% reduction in memory usage compared to existing methods.
- Addresses privacy concerns by shifting LLM training from cloud to edge.
- Combines algorithmic and system-level innovations (an algorithm-system co-design) for enhanced training efficiency.
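The Parallel Adapter and activation-cache ideas in the takeaways above can be illustrated with a minimal sketch. This is not the paper's implementation; the layer stand-in, dimensions, and initialization below are assumptions chosen only to show the additive (parallel) structure that makes frozen-layer activations cacheable:

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_layer(x, W):
    """Stand-in for a frozen pretrained sub-layer (W is never updated)."""
    return x @ W

def parallel_adapter(x, A, B):
    """Low-rank bottleneck branch running in parallel with the frozen layer.
    Only A (down-projection) and B (up-projection) are trainable."""
    return (x @ A) @ B

d, r = 16, 2                        # hidden size and small adapter rank
W = rng.normal(size=(d, d))         # frozen pretrained weights
A = rng.normal(size=(d, r)) * 0.01  # trainable down-projection
B = np.zeros((r, d))                # trainable up-projection, zero-initialized
                                    # so training starts from the frozen model

x = rng.normal(size=(4, d))         # a batch of 4 token activations
y = frozen_layer(x, W) + parallel_adapter(x, A, B)

# Because the adapter is additive, the frozen layer's output for given inputs
# can be cached and reused across training steps; only the tiny A/B branch is
# recomputed. This is the intuition behind an activation cache like PAC's.
assert np.allclose(y, frozen_layer(x, W))  # zero-init B: no change at step 0
print(A.size + B.size, "trainable parameters vs", W.size, "frozen")
```

The zero-initialized up-projection is a common adapter convention: the model's behavior is unchanged before training, and only the small low-rank branch contributes gradients.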
Computer Science > Distributed, Parallel, and Cluster Computing
arXiv:2408.10746 (cs)
[Submitted on 20 Aug 2024 (v1), last revised 14 Feb 2026 (this version, v2)]
Title: Resource-Efficient Personal Large Language Models Fine-Tuning with Collaborative Edge Computing
Authors: Shengyuan Ye, Bei Ouyang, Tianyi Qian, Liekang Zeng, Jingyi Li, Jiangsu Du, Xiaowen Chu, Guoliang Xing, Xu Chen
Abstract: Large language models (LLMs) have unlocked a plethora of powerful applications at the network edge, such as intelligent personal assistants. Data privacy and security concerns have prompted a shift towards edge-based fine-tuning of personal LLMs, away from cloud reliance. However, this raises issues of computational intensity and resource scarcity, hindering training efficiency and feasibility. While current studies investigate parameter-efficient fine-tuning (PEFT) techniques to mitigate resource constraints, our analysis indicates that these techniques are not sufficiently resource-efficient for edge devices. To tackle these challenges, we propose Pluto and Charon (PAC), a time and memory efficient collaborative edge AI framework for personal LLMs fine-tuning. PAC breaks the resource wall of personal LLMs fine-tuning with a sophisticated algorithm-system co-design. (1) Algorithmically, PAC imple...