[2603.21331] AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search

arXiv - Machine Learning March 24, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.21331: AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search

Computer Science > Machine Learning

arXiv:2603.21331 (cs) [Submitted on 22 Mar 2026]

Title: AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search

Authors: Jaber Jaber, Osama Jaber

Abstract: Writing high-performance GPU kernels is among the most labor-intensive tasks in machine learning systems engineering. We present AutoKernel, an open-source framework that applies an autonomous agent loop to GPU kernel optimization for arbitrary PyTorch models. Given a model, AutoKernel profiles it to identify computational bottlenecks, ranks them by Amdahl's law impact, and iteratively refines Triton or CUDA C++ kernel implementations through hundreds of experiments without human intervention. A five-stage correctness harness covering smoke tests, shape sweeps, numerical stability, determinism verification, and edge-case coverage ensures every candidate kernel is validated before any speedup is recorded. The system comprises over 9,000 lines of Python, 18 starter kernel implementations across two backends, a six-tier optimization playbook, and integration with the KernelBench benchmark suite. AutoKernel covers nine kernel types spanning the dominant operations in modern transformer architectures. On an NVIDIA H100, our Triton kernels outperform both PyTorch eager and this http URL (max-aut...
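To make the "ranks them by Amdahl's law impact" step concrete, here is a minimal Python sketch of how profiled operations might be ordered by their potential end-to-end gain. It is not taken from the AutoKernel code base: the names KernelProfile, amdahl_speedup, and rank_by_impact, the sample timings, and the assumed 2x per-kernel speedup are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class KernelProfile:
    """Hypothetical record of one profiled operation."""
    name: str
    time_ms: float  # time attributed to this operation in one profiled run


def amdahl_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the runtime
    is accelerated by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def rank_by_impact(profiles, assumed_local_speedup=2.0):
    """Return (name, runtime fraction, projected end-to-end speedup) tuples,
    sorted so the operation with the largest projected gain comes first.
    The uniform `assumed_local_speedup` is an illustrative assumption."""
    total = sum(p.time_ms for p in profiles)
    ranked = []
    for p in profiles:
        fraction = p.time_ms / total
        ranked.append((p.name, fraction, amdahl_speedup(fraction, assumed_local_speedup)))
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked


# Made-up timings for three operations; not measurements from the paper.
profiles = [
    KernelProfile("softmax", 30.0),
    KernelProfile("matmul", 60.0),
    KernelProfile("layernorm", 10.0),
]
ranking = rank_by_impact(profiles)
for name, fraction, speedup in ranking:
    print(f"{name}: {fraction:.0%} of runtime, projected {speedup:.2f}x end to end")
```

With a uniform assumed speedup, the ordering reduces to sorting by runtime share, which is why an operation that takes little time is a poor optimization target even if its kernel could be made much faster.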

Originally published on March 24, 2026. Curated by AI News.
