[2603.01692] Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search


arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2603.01692: Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search

Computer Science > Machine Learning
arXiv:2603.01692 (cs)
[Submitted on 2 Mar 2026]

Title: Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search

Authors: Yifei Zhang, Xu Yang, Xiao Yang, Bowen Xian, Qizheng Li, Shikai Fang, Jingyuan Li, Jian Wang, Mingrui Xu, Weiqing Liu, Jiang Bian

Abstract: LLM-based agents for machine learning engineering (MLE) predominantly rely on tree search, a form of gradient-free optimization that uses scalar validation scores to rank candidates. As LLM reasoning capabilities improve, exhaustive enumeration becomes increasingly inefficient compared to directed updates, analogous to how accurate gradients enable efficient descent over random search. We introduce Gome, an MLE agent that operationalizes gradient-based optimization. Gome maps structured diagnostic reasoning to gradient computation, success memory to momentum, and multi-trace execution to distributed optimization. Under a closed-world protocol that isolates architectural effects from external knowledge, Gome achieves a state-of-the-art 35.1% any-medal rate on MLE-Bench with a restricted 12-hour budget on a single V100 GPU. Scaling experiments across 10 models reveal a critical crossover: with weaker models, tree search retains advantages by compensating for unreliable reasoning through exhaustive ex...
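
The analogy the abstract leans on (directed updates versus gradient-free search that only ranks candidates by a scalar score) can be illustrated with a toy numerical sketch. This is not Gome's implementation and makes no claim about how the paper realizes its mapping. It assumes a simple quadratic objective, and the function names `loss`, `grad`, `random_search`, and `momentum_descent` are hypothetical, chosen only for illustration.

```python
import random


def loss(x):
    # Toy objective (an assumption for illustration): squared distance from the origin.
    return sum(v * v for v in x)


def grad(x):
    # Exact gradient of the toy objective above.
    return [2.0 * v for v in x]


def random_search(x0, steps, scale=0.5, seed=0):
    # Gradient-free baseline: propose a random perturbation and keep it
    # only if the scalar score improves. No directional information is used.
    rng = random.Random(seed)
    best = list(x0)
    best_loss = loss(best)
    for _ in range(steps):
        cand = [v + rng.gauss(0.0, scale) for v in best]
        cand_loss = loss(cand)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
    return best_loss


def momentum_descent(x0, steps, lr=0.1, beta=0.5):
    # Directed update: follow the negative gradient, with a velocity term
    # that carries over part of the previous step (classical momentum).
    x = list(x0)
    vel = [0.0] * len(x)
    for _ in range(steps):
        g = grad(x)
        vel = [beta * v - lr * gi for v, gi in zip(vel, g)]
        x = [xi + vi for xi, vi in zip(x, vel)]
    return loss(x)


if __name__ == "__main__":
    start = [3.0] * 10
    print("random search loss:   ", random_search(start, steps=50))
    print("momentum descent loss:", momentum_descent(start, steps=50))
```

With the same step budget, the directed update uses information that the score-only search discards. The abstract's crossover claim corresponds to the case where that directional signal is unreliable, which this toy example, using an exact gradient, does not model.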

Originally published on March 03, 2026. Curated by AI News.
