[2510.14686] xLLM Technical Report

arXiv - AI

About this article

Abstract page for arXiv paper 2510.14686: xLLM Technical Report

Computer Science > Distributed, Parallel, and Cluster Computing
arXiv:2510.14686 (cs)
[Submitted on 16 Oct 2025 (v1), last revised 3 Mar 2026 (this version, v2)]

Title: xLLM Technical Report

Authors: Tongxuan Liu, Tao Peng, Peijun Yang, Xiaoyang Zhao, Xiusheng Lu, Weizhe Huang, Zirui Liu, Xiaoyu Chen, Zhiwei Liang, Jun Xiong, Donghe Jin, Minchao Zhang, Jinrong Guo, Yingxu Deng, Xu Zhang, Xianzhe Dong, Siqi Wang, Siyu Wu, Yu Wu, Zihan Tang, Yuting Zeng, Yanshu Wang, Jinguang Liu, Meng Kang, Menxin Li, Yunlong Wang, Yiming Liu, Xiaolong Ma, Yifan Wang, Yichen Zhang, Jinrun Yin, Keyang Zheng, Jiawei Yin, Jun Zhang, Ziyue Wang, Xiaobo Lin, Liangyu Liu, Liwei Lan, Yang Liu, Chunhua Peng, Han Liu, Songcheng Ren, Xuezhu Wang, Yunheng Shen, Yi Wang, Guyue Liu, Yitao Hu, Hui Chen, Tong Yang, Hailong Yang, Jing Li, Guiguang Ding, Ke Zhang

Abstract: We introduce xLLM, an intelligent and efficient Large Language Model (LLM) inference framework designed for high-performance, large-scale enterprise-grade serving, with deep optimizations for diverse AI accelerators. To address these challenges, xLLM builds a novel decoupled service-engine architecture. At the service layer, xLLM-Service features an intelligent scheduling module that efficiently processes multimodal requests and co-locates online and offline tasks through unified elastic scheduling to maximize cluster utili...

Originally published on March 04, 2026. Curated by AI News.
