[2602.02660] MARS: Modular Agent with Reflective Search for Automated AI Research
Summary
The paper introduces MARS (Modular Agent with Reflective Search), a framework for automated AI research that combines budget-aware planning, modular code construction, and comparative reflective memory.
Why It Matters
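Automating AI research differs from general software engineering in two ways: evaluation is computationally expensive (e.g., model training), and it is hard to attribute a change in performance to a specific cause. Current LLM-based agents often generate monolithic scripts that ignore execution costs and causal factors. MARS targets both problems by weighing performance against execution cost during search and by comparing solutions to extract reusable insights.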
Key Takeaways
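- MARS uses Budget-Aware Planning, implemented as cost-constrained Monte Carlo Tree Search (MCTS), to balance performance against execution cost.
- Modular Construction uses a "Design-Decompose-Implement" pipeline to manage complex research repositories.
- Comparative Reflective Memory addresses credit assignment by analyzing the differences between solutions and distilling high-signal insights.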
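- MARS achieves state-of-the-art performance among open-source frameworks on MLE-Bench under comparable settings, and remains competitive with the top methods on the global leaderboard.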
- The agent demonstrates cross-branch learning, improving generalization across tasks.
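To make the Budget-Aware Planning takeaway more concrete, here is a minimal sketch of how a tree-search selection step might trade expected performance against execution cost. It is not the paper's formulation, which this summary does not spell out. The `Node` fields, the `cost_weight` penalty, and the hard affordability cutoff are all assumptions chosen for illustration; the base rule is a standard UCT-style score.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """A candidate solution in the search tree (hypothetical structure)."""
    name: str
    est_cost: float             # assumed estimate of execution cost, e.g. GPU-hours
    visits: int = 0
    total_reward: float = 0.0   # sum of validation scores seen through this node
    children: list[Node] = field(default_factory=list)


def cost_aware_score(child: Node, parent_visits: int, remaining_budget: float,
                     c_explore: float = 1.4, cost_weight: float = 0.5) -> float:
    """UCT-style score minus a penalty for the fraction of budget the node would use."""
    if remaining_budget <= 0 or child.est_cost > remaining_budget:
        return -math.inf  # assumed hard constraint: never pick an unaffordable node
    exploit = child.total_reward / child.visits if child.visits else 0.0
    explore = c_explore * math.sqrt(math.log(parent_visits + 1) / (child.visits + 1))
    penalty = cost_weight * child.est_cost / remaining_budget
    return exploit + explore - penalty


def select_child(parent: Node, remaining_budget: float, **kwargs) -> Node | None:
    """Return the best affordable child, or None if the budget rules out all of them."""
    scored = [(cost_aware_score(ch, parent.visits, remaining_budget, **kwargs), ch)
              for ch in parent.children]
    affordable = [(s, ch) for s, ch in scored if s > -math.inf]
    if not affordable:
        return None
    return max(affordable, key=lambda pair: pair[0])[1]


# Illustrative data only: two evaluated branches and one that exceeds the budget.
root = Node("root", est_cost=0.0, visits=10)
root.children = [
    Node("cheap", est_cost=1.0, visits=5, total_reward=3.5),
    Node("pricey", est_cost=8.0, visits=5, total_reward=3.75),
    Node("huge", est_cost=50.0),
]
best = select_child(root, remaining_budget=10.0)
print(best.name if best else "no affordable child")
```

With these made-up numbers, the slightly weaker but much cheaper branch is selected; setting `cost_weight=0.0` recovers a plain UCT-style choice among the affordable branches, which then favors the more expensive one.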
arXiv:2602.02660 (cs), Computer Science > Artificial Intelligence
Submitted on 2 Feb 2026 (v1), last revised 17 Feb 2026 (this version, v2)
Title: MARS: Modular Agent with Reflective Search for Automated AI Research
Authors: Jiefeng Chen, Bhavana Dalvi Mishra, Jaehyun Nam, Rui Meng, Tomas Pfister, Jinsung Yoon
Abstract: Automating AI research differs from general software engineering due to computationally expensive evaluation (e.g., model training) and opaque performance attribution. Current LLM-based agents struggle here, often generating monolithic scripts that ignore execution costs and causal factors. We introduce MARS (Modular Agent with Reflective Search), a framework optimized for autonomous AI research. MARS relies on three pillars: (1) Budget-Aware Planning via cost-constrained Monte Carlo Tree Search (MCTS) to explicitly balance performance with execution expense; (2) Modular Construction, employing a "Design-Decompose-Implement" pipeline to manage complex research repositories; and (3) Comparative Reflective Memory, which addresses credit assignment by analyzing solution differences to distill high-signal insights. MARS achieves state-of-the-art performance among open-source frameworks on MLE-Bench under comparable settings, maintaining competitiveness with the global leaderboard's top methods. Furthermore, the system e...
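
The idea behind Comparative Reflective Memory, analyzing solution differences for credit assignment, can be illustrated with a small sketch that pairs a code diff with the change in score. This is only an assumption about what the raw input to such a memory might look like; the abstract does not describe how MARS computes or distills these differences. The function name `compare_solutions` and the record fields are hypothetical, and the diff comes from Python's standard `difflib.ndiff`.

```python
import difflib


def compare_solutions(parent_code: str, child_code: str,
                      parent_score: float, child_score: float) -> dict:
    """Pair the lines that changed between two solutions with the score change.

    Hypothetical record format: a real system would presumably pass this to an
    LLM to distill an insight; here we only extract the raw comparison.
    """
    diff = difflib.ndiff(parent_code.splitlines(), child_code.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ";
    # "? " hint lines and unchanged lines are dropped.
    changed = [line for line in diff if line.startswith(("- ", "+ "))]
    return {
        "score_delta": child_score - parent_score,
        "changed_lines": changed,
    }


# Illustrative data only: a child solution that lowers the learning rate.
parent = "lr = 0.1\nepochs = 5"
child = "lr = 0.01\nepochs = 5"
record = compare_solutions(parent, child, parent_score=0.5, child_score=0.75)
print(record)
```

Keeping only the changed lines next to the score delta narrows attribution to what actually differs between two solutions, rather than to either script as a whole.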