[2603.23566] AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization

arXiv - Machine Learning March 26, 2026 4 min read

Computer Science > Machine Learning

arXiv:2603.23566 (cs)

[Submitted on 24 Mar 2026]

Title: AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization

Authors: Jiehao Wu, Zixiao Huang, Wenhao Li, Chuyun Shen, Junjie Sheng, Xiangfeng Wang

Abstract: AscendC (Ascend C) operator optimization on Huawei Ascend neural processing units (NPUs) faces a two-fold knowledge bottleneck. First, unlike the CUDA ecosystem, there are few public reference implementations to learn from. Second, performance hinges on a coupled two-part artifact: a host-side tiling program that orchestrates data movement, and a kernel program that schedules and pipelines instructions.

We present AscendOptimizer, an episodic agent that bootstraps this missing expertise by turning execution into experience. On the host side, AscendOptimizer performs profiling-in-the-loop evolutionary search to discover valid and high-performing tiling and data-movement configurations directly from hardware feedback. On the kernel side, it mines transferable optimization motifs by rewinding optimized kernels (systematically de-optimizing them to synthesize instructive "bad-to-good" trajectories) and distills these motifs into a retrievable experience bank for guided rewriting. By alternating host tuning and kernel rewriting in a closed loop, AscendOptimizer steadily expand...
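To make the host-side idea concrete, here is a minimal Python sketch of a profiling-in-the-loop evolutionary search over tiling configurations. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the search operators or the configuration space, so TilingConfig, UB_CAPACITY, the candidate values, and mock_profile (a synthetic stand-in for a real hardware profiler) are all hypothetical names invented for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TilingConfig:
    # Hypothetical tiling knobs; the real AscendC tiling space is richer.
    tile_len: int      # elements processed per tile
    num_buffers: int   # 1 = single buffering, 2 = double buffering


TILE_CHOICES = [64, 128, 256, 512, 1024, 2048]
BUFFER_CHOICES = [1, 2]
UB_CAPACITY = 2048  # hypothetical on-chip buffer budget, in elements


def is_valid(cfg):
    """A config is valid only if all of its buffers fit in on-chip memory."""
    return cfg.tile_len * cfg.num_buffers <= UB_CAPACITY


def mock_profile(cfg, total_len=1 << 16):
    """Synthetic latency model standing in for real hardware feedback."""
    if not is_valid(cfg):
        return float("inf")  # invalid configs never win selection
    num_tiles = total_len // cfg.tile_len
    per_tile_overhead = 5.0                 # fixed cost of launching each tile
    compute = total_len * 0.01              # cost proportional to data size
    overlap = 0.6 if cfg.num_buffers == 2 else 1.0  # double buffering hides copies
    return num_tiles * per_tile_overhead + compute * overlap


def mutate(cfg, rng):
    """Resample one knob at random, keeping the other unchanged."""
    if rng.random() < 0.5:
        return TilingConfig(rng.choice(TILE_CHOICES), cfg.num_buffers)
    return TilingConfig(cfg.tile_len, rng.choice(BUFFER_CHOICES))


def evolve(profile, generations=20, pop_size=8, seed=0):
    """Keep the fastest half of each generation and refill by mutation."""
    rng = random.Random(seed)
    population = [
        TilingConfig(rng.choice(TILE_CHOICES), rng.choice(BUFFER_CHOICES))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        survivors = sorted(population, key=profile)[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return min(population, key=profile)


if __name__ == "__main__":
    best = evolve(mock_profile)
    print(best, mock_profile(best))
```

Because survivors carry over unchanged, the best measured latency never gets worse from one generation to the next. In a real setting, mock_profile would be replaced by a function that compiles the operator with the candidate tiling, runs it on the device, and returns the measured latency, with compile or runtime failures mapped to an infinite cost.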

Originally published on March 26, 2026. Curated by AI News.
