[2601.16294] Space Filling Curves is All You Need: Communication-Avoiding Matrix Multiplication Made Simple
About this article
Abstract page for arXiv paper 2601.16294: Space Filling Curves is All You Need: Communication-Avoiding Matrix Multiplication Made Simple
Computer Science > Distributed, Parallel, and Cluster Computing arXiv:2601.16294 (cs) [Submitted on 22 Jan 2026 (v1), last revised 7 Apr 2026 (this version, v2)] Title:Space Filling Curves is All You Need: Communication-Avoiding Matrix Multiplication Made Simple Authors:Evangelos Georganas, Alexander Heinecke, Pradeep Dubey View a PDF of the paper titled Space Filling Curves is All You Need: Communication-Avoiding Matrix Multiplication Made Simple, by Evangelos Georganas and 2 other authors View PDF HTML (experimental) Abstract:General Matrix Multiplication (GEMM) is the cornerstone of HPC workloads and Deep Learning. State-of-the-art vendor libraries tune tensor layouts, parallelization schemes, and cache blocking to minimize data movement across the memory hierarchy and maximize throughput. Optimal settings for these parameters depend on the target platform and matrix shapes, making exhaustive tuning infeasible. We revisit Space Filling Curves (SFC) to alleviate this cumbersome tuning. We partition the Matrix Multiplication using advancements in SFC, and obtain platform-oblivious and shape-oblivious Matrix Multiplication schemes with high degree of data locality. We extend the SFC-based work partitioning to implement Communication-Avoiding (CA) algorithms that provably minimize data movement. The integration of CA-algorithms is seamless with compact code, achieving state-of-the-art results on multiple CPU platforms, outperforming vendor libraries up to 5.5x for a range o...