[2603.02376] CUCo: An Agentic Framework for Compute and Communication Co-design

arXiv - Machine Learning

About this article

Computer Science > Distributed, Parallel, and Cluster Computing
arXiv:2603.02376 (cs) [Submitted on 2 Mar 2026]

Title: CUCo: An Agentic Framework for Compute and Communication Co-design
Authors: Bodun Hu, Yoga Sri Varshan V, Saurabh Agarwal, Aditya Akella

Abstract: Custom CUDA kernel development is essential for maximizing GPU utilization in large-scale distributed LLM training and inference, yet manually writing kernels that jointly leverage both computation and communication remains a labor-intensive and error-prone process. Prior work on kernel optimization has focused almost exclusively on computation, leaving communication kernels largely untouched even though they constitute a significant share of total execution time. We introduce CUCo, a training-free, agent-driven workflow that automatically generates high-performance CUDA kernels that jointly orchestrate computation and communication. By co-optimizing these traditionally disjoint components, CUCo unlocks new optimization opportunities unavailable to existing approaches, outperforming state-of-the-art baselines and reducing end-to-end latency by up to $1.57\times$.

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Cite as: arXiv:2603.02376 [cs.DC]
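The abstract centers on overlapping computation with communication inside GPU work rather than running them back-to-back. As a generic, minimal sketch of that idea (not CUCo's generated code and not taken from the paper), the example below splits a buffer into chunks, runs a trivial compute kernel per chunk on one CUDA stream, and overlaps each chunk's data movement on a second stream using an event dependency. All names (`scale`, `kChunks`, chunk sizes) are made up for illustration, the "communication" is a stand-in device-to-host copy rather than a collective, and error checking is omitted.

```cuda
// Sketch: overlap chunked compute (stream "compute") with data movement
// (stream "comm") so later chunks' compute runs while earlier chunks move.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float* data, int n, float alpha) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= alpha;
}

int main() {
    const int kChunks = 4;
    const int kChunkElems = 1 << 20;
    const size_t kChunkBytes = kChunkElems * sizeof(float);

    float *d_buf, *h_buf;
    cudaMalloc(reinterpret_cast<void**>(&d_buf), kChunks * kChunkBytes);
    // Pinned host memory so cudaMemcpyAsync can actually run asynchronously.
    cudaMallocHost(reinterpret_cast<void**>(&h_buf), kChunks * kChunkBytes);

    cudaStream_t compute, comm;
    cudaStreamCreate(&compute);
    cudaStreamCreate(&comm);

    cudaEvent_t chunk_done;
    cudaEventCreate(&chunk_done);

    for (int c = 0; c < kChunks; ++c) {
        float* d_chunk = d_buf + c * kChunkElems;
        float* h_chunk = h_buf + c * kChunkElems;

        // Compute on chunk c in the compute stream.
        scale<<<(kChunkElems + 255) / 256, 256, 0, compute>>>(d_chunk, kChunkElems, 2.0f);

        // Let the comm stream start moving chunk c as soon as its compute
        // finishes, while the compute stream moves on to chunk c+1.
        cudaEventRecord(chunk_done, compute);
        cudaStreamWaitEvent(comm, chunk_done, 0);
        cudaMemcpyAsync(h_chunk, d_chunk, kChunkBytes,
                        cudaMemcpyDeviceToHost, comm);
    }

    cudaStreamSynchronize(compute);
    cudaStreamSynchronize(comm);
    printf("first element after overlap: %f\n", h_buf[0]);

    cudaEventDestroy(chunk_done);
    cudaStreamDestroy(compute);
    cudaStreamDestroy(comm);
    cudaFreeHost(h_buf);
    cudaFree(d_buf);
    return 0;
}
```

In a distributed setting the copy would typically be replaced by a collective (e.g. an all-reduce) issued on the communication stream; the paper's contribution is having an agent generate and tune such co-scheduled kernels automatically rather than by hand.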

Originally published on March 04, 2026. Curated by AI News.

Related Articles

[2603.16629] MLLM-based Textual Explanations for Face Comparison (arXiv - AI)

[2603.15159] To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation (arXiv - AI)

[2602.08316] SWE Context Bench: A Benchmark for Context Learning in Coding (arXiv - AI)

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets? (arXiv - AI)