[2510.11661] SR-Scientist: Scientific Equation Discovery With Agentic AI
Summary
The paper presents SR-Scientist, a framework in which a Large Language Model (LLM) acts as an autonomous agent for scientific equation discovery. It outperforms baseline methods by an absolute margin of 6% to 35% on datasets covering four science disciplines.
Why It Matters
This research highlights the potential of agentic AI in scientific discovery, showcasing how LLMs can evolve from mere proposers to autonomous agents capable of data analysis and equation optimization. It has implications for advancing AI's role in scientific research and innovation.
Key Takeaways
- SR-Scientist transforms LLMs into autonomous AI scientists.
- The framework outperforms baseline methods by an absolute margin of 6% to 35%.
- It effectively handles noise and generalizes findings to new data.
- An end-to-end reinforcement learning approach enhances the agent's capabilities.
- The evaluation datasets cover four science disciplines, indicating broad applicability.
Computer Science > Artificial Intelligence
arXiv:2510.11661 (cs)
[Submitted on 13 Oct 2025 (v1), last revised 17 Feb 2026 (this version, v2)]
Title: SR-Scientist: Scientific Equation Discovery With Agentic AI
Authors: Shijie Xia, Yuhan Sun, Pengfei Liu
Abstract: Recently, Large Language Models (LLMs) have been applied to scientific equation discovery, leveraging their embedded scientific knowledge for hypothesis generation. However, current methods typically confine LLMs to the role of an equation proposer within search algorithms like genetic programming. In this paper, we present SR-Scientist, a framework that elevates the LLM from a simple equation proposer to an autonomous AI scientist that writes code to analyze data, implements the equation as code, submits it for evaluation, and optimizes the equation based on experimental feedback. Specifically, we wrap the code interpreter into a set of tools for data analysis and equation evaluation. The agent is instructed to optimize the equation by utilizing these tools over a long horizon with minimal human-defined pipelines. Empirical results show that SR-Scientist outperforms baseline methods by an absolute margin of 6% to 35% on datasets covering four science disciplines. Additionally, we demonstrate our method's robustness to noise, the generalization of the discovered equ...
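The loop the abstract describes (implement an equation as code, submit it for evaluation, refine based on feedback) can be illustrated with a minimal sketch. Nothing below is taken from the paper's implementation: the function names `evaluate_equation` and `run_agent_loop`, the choice of mean absolute percentage error as the feedback signal, the stopping threshold, and the synthetic data are all assumptions made for illustration, and a fixed list of candidate equations stands in for the LLM's proposals.

```python
import math
from typing import Callable, List, Sequence, Tuple

Equation = Callable[[float], float]


def evaluate_equation(equation: Equation, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Hypothetical evaluation tool: mean absolute percentage error on the data.

    Assumes every target value in ys is nonzero.
    """
    errors = [abs(equation(x) - y) / abs(y) for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


def run_agent_loop(
    candidates: List[Tuple[str, Equation]],
    xs: Sequence[float],
    ys: Sequence[float],
    target_error: float = 1e-3,
) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Submit candidates one at a time, track the best, stop once the target is met.

    In an agentic setup an LLM would propose each next candidate after seeing
    the feedback in `history`; here the candidates are a fixed list (assumption).
    """
    best_name, best_error = "", math.inf
    history: List[Tuple[str, float]] = []
    for name, equation in candidates:
        error = evaluate_equation(equation, xs, ys)
        history.append((name, error))  # feedback the agent would inspect
        if error < best_error:
            best_name, best_error = name, error
        if best_error <= target_error:
            break
    return best_name, best_error, history


# Synthetic data generated from y = 3 * x^2 (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x ** 2 for x in xs]

# Stand-ins for equations an agent might write as code.
candidates = [
    ("linear", lambda x: 3.0 * x),
    ("quadratic", lambda x: 3.0 * x ** 2),
    ("cubic", lambda x: 3.0 * x ** 3),
]

best_name, best_error, history = run_agent_loop(candidates, xs, ys)
print(best_name, best_error)
```

In this sketch the loop stops after the second candidate because its error falls below the threshold, so the third candidate is never evaluated. A real agent would also fit free constants and analyze the data before proposing a form, which this example leaves out.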