[2511.22599] DisCEdge: Distributed Context Management for Large Language Models at the Edge
Computer Science > Distributed, Parallel, and Cluster Computing
arXiv:2511.22599 (cs)
[Submitted on 27 Nov 2025 (v1), last revised 8 Apr 2026 (this version, v2)]

Title: DisCEdge: Distributed Context Management for Large Language Models at the Edge
Authors: Mohammadreza Malekabbasi, Minghe Wang, David Bermbach

Abstract: Deploying Large Language Model (LLM) services at the edge benefits latency-sensitive and privacy-aware applications. However, the stateless nature of LLMs makes managing user context (e.g., sessions, preferences) across geo-distributed edge nodes challenging. Existing solutions, such as client-side context storage, introduce network latency and bandwidth overhead, undermining edge deployment advantages. We propose DisCEdge, a distributed context management system that stores and replicates user context in tokenized form across edge nodes. By maintaining context as token sequences, our system avoids redundant computation and enables efficient data replication. We evaluate an open-source prototype in a realistic edge environment. DisCEdge improves median response times by up to 14.46% and lowers median inter-node synchronization overhead by up to 15% compared to a raw-text-based system. It also reduces client request sizes by a median of 90% compared to client-side context manag...
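The abstract's core idea, keeping per-user context as token sequences on edge nodes so that clients and peers exchange only deltas rather than raw text, can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual API or implementation; the class and method names are invented for exposition, and real token IDs would come from the model's tokenizer.

```python
# Hypothetical sketch of token-based context storage at an edge node
# (not DisCEdge's actual implementation). Each user's context is held
# as a list of token IDs; appending a turn returns only the new tokens,
# which is the delta a node would replicate to its peers.

class EdgeContextStore:
    def __init__(self):
        self._contexts = {}  # user_id -> list of token IDs

    def append(self, user_id, new_tokens):
        """Append newly tokenized input and return the replication delta."""
        self._contexts.setdefault(user_id, []).extend(new_tokens)
        return list(new_tokens)  # only the delta crosses the network

    def context(self, user_id):
        """Full tokenized context, ready to feed the model without re-tokenizing."""
        return list(self._contexts.get(user_id, []))


store = EdgeContextStore()
store.append("alice", [101, 2023, 2003])        # first turn: 3 token IDs
delta = store.append("alice", [1037, 3231])     # later turn: only 2 new IDs sent
print(len(store.context("alice")), len(delta))  # 5 2
```

Because the stored context is already tokenized, a node serving a follow-up request can skip re-tokenization, and inter-node synchronization ships compact integer deltas instead of full raw-text histories, which is the intuition behind the reported reductions in request size and sync overhead.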