[2603.02376] CUCo: An Agentic Framework for Compute and Communication

[2603.02376] CUCo: An Agentic Framework for Compute and Communication Co-design

arXiv - Machine Learning March 04, 2026 3 min read

About this article

Abstract page for arXiv paper 2603.02376: CUCo: An Agentic Framework for Compute and Communication Co-design

Computer Science > Distributed, Parallel, and Cluster Computing arXiv:2603.02376 (cs) [Submitted on 2 Mar 2026] Title:CUCo: An Agentic Framework for Compute and Communication Co-design Authors:Bodun Hu, Yoga Sri Varshan V, Saurabh Agarwal, Aditya Akella View a PDF of the paper titled CUCo: An Agentic Framework for Compute and Communication Co-design, by Bodun Hu and 3 other authors View PDF HTML (experimental) Abstract:Custom CUDA kernel development is essential for maximizing GPU utilization in large-scale distributed LLM training and inference, yet manually writing kernels that jointly leverage both computation and communication remains a labor-intensive and error-prone process. Prior work on kernel optimization has focused almost exclusively on computation, leaving communication kernels largely untouched even though they constitute a significant share of total execution time. We introduce CUCo, a training-free agent-driven workflow that automatically generates high-performance CUDA kernels that jointly orchestrate computation and communication. By co-optimizing these traditionally disjoint components, CUCo unlocks new optimization opportunities unavailable to existing approaches, outperforming state-of-the-art baselines and reducing end-to-end latency by up to $1.57\times$. Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Machine Learning (cs.LG); Multiagent Systems (cs.MA) Cite as: arXiv:2603.02376 [cs.DC] (or arXiv:2603....

Originally published on March 04, 2026. Curated by AI News.

Llms

[2603.16629] MLLM-based Textual Explanations for Face Comparison

Abstract page for arXiv paper 2603.16629: MLLM-based Textual Explanations for Face Comparison

arXiv - AI · 4 min · 33 minutes ago

Llms

[2603.15159] To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

Abstract page for arXiv paper 2603.15159: To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

arXiv - AI · 4 min · 33 minutes ago

Llms

[2602.08316] SWE Context Bench: A Benchmark for Context Learning in Coding

Abstract page for arXiv paper 2602.08316: SWE Context Bench: A Benchmark for Context Learning in Coding

arXiv - AI · 4 min · 33 minutes ago

Llms

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

Abstract page for arXiv paper 2601.13227: Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

arXiv - AI · 3 min · 33 minutes ago

[2603.02376] CUCo: An Agentic Framework for Compute and Communication Co-design

About this article

Related Articles

[2603.16629] MLLM-based Textual Explanations for Face Comparison

[2603.15159] To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

[2602.08316] SWE Context Bench: A Benchmark for Context Learning in Coding

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

No comments

Stay updated with AI News