[2602.12259] Think like a Scientist: Physics-guided LLM Agent for Equation Discovery
Summary
The paper introduces KeplerAgent, a physics-guided LLM framework designed for symbolic equation discovery, enhancing accuracy and robustness in scientific reasoning.
Why It Matters
Most existing LLM-based systems try to guess equations directly from data. This research instead models the multi-step reasoning that scientists often follow: first inferring physical properties such as symmetries, then using them as priors to restrict the space of candidate equations. It demonstrates the potential of AI to improve the discovery of symbolic equations, which is crucial for advancing scientific knowledge and applications.
Key Takeaways
- KeplerAgent models the scientific reasoning process for equation discovery.
- It outperforms traditional LLMs and baselines in symbolic accuracy.
- The framework enhances robustness against noisy data.
- It uses physics-based tools to configure symbolic regression engines such as PySINDy and PySR.
- Demonstrates the potential for AI to assist in scientific research.
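To make the idea of physics-informed symbolic regression concrete, here is a minimal toy sketch in plain Python. It is not KeplerAgent's implementation and does not call PySINDy or PySR. It assumes a hypothetical one-dimensional problem in which a parity check on the data plays the role of the physics-based tool, and a small least-squares fit over a pruned candidate library stands in for the regression engine. The names `LIBRARY`, `detect_parity`, `solve_linear`, and `fit` are illustrative only.

```python
import random

# Hypothetical candidate library: term name -> (function, parity).
# Parity is +1 for even terms and -1 for odd terms.
LIBRARY = {
    "1": (lambda x: 1.0, +1),
    "x": (lambda x: x, -1),
    "x^2": (lambda x: x * x, +1),
    "x^3": (lambda x: x ** 3, -1),
}


def detect_parity(xs, ys, tol):
    """Return -1 if the data look odd, +1 if even, 0 if neither.

    Assumes xs is a grid symmetric about zero, so every x > 0 has a
    matching sample at -x.
    """
    lookup = dict(zip(xs, ys))
    pairs = [(lookup[x], lookup[-x]) for x in xs if x > 0]
    odd_err = max(abs(a + b) for a, b in pairs)
    even_err = max(abs(a - b) for a, b in pairs)
    if odd_err <= tol:
        return -1
    if even_err <= tol:
        return +1
    return 0


def solve_linear(a, b):
    """Solve a @ c = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(names, xs, ys):
    """Least-squares fit of ys as a linear combination of the named terms."""
    cols = [[LIBRARY[name][0](x) for x in xs] for name in names]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * y for p, y in zip(ci, ys)) for ci in cols]
    return dict(zip(names, solve_linear(gram, rhs)))


# Synthetic data from y = 2x - 0.5x^3 with small bounded noise.
rng = random.Random(0)
xs = [i * 0.25 for i in range(-8, 9)]
ys = [2.0 * x - 0.5 * x ** 3 + rng.uniform(-0.005, 0.005) for x in xs]

# Step 1: infer a physical property (here, parity) from the data.
parity = detect_parity(xs, ys, tol=0.05)

# Step 2: use it as a prior to restrict the candidate library.
allowed = [name for name, (_, p) in LIBRARY.items() if parity == 0 or p == parity]

# Step 3: run the regression on the restricted library only.
coeffs = fit(allowed, xs, ys)
print(parity, coeffs)
```

Pruning the library before fitting is the point of the sketch: the even terms never enter the regression, so noise cannot be absorbed into spurious constant or quadratic coefficients. A real pipeline would pass the restricted library and constraints to an engine such as PySINDy or PySR rather than a hand-written solver.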
Computer Science > Artificial Intelligence
arXiv:2602.12259 (cs)
[Submitted on 12 Feb 2026 (v1), last revised 24 Feb 2026 (this version, v2)]
Title: Think like a Scientist: Physics-guided LLM Agent for Equation Discovery
Authors: Jianke Yang, Ohm Venkatachalam, Mohammad Kianezhad, Sharvaree Vadgama, Rose Yu
Abstract: Explaining observed phenomena through symbolic, interpretable formulas is a fundamental goal of science. Recently, large language models (LLMs) have emerged as promising tools for symbolic equation discovery, owing to their broad domain knowledge and strong reasoning capabilities. However, most existing LLM-based systems try to guess equations directly from data, without modeling the multi-step reasoning process that scientists often follow: first inferring physical properties such as symmetries, then using these as priors to restrict the space of candidate equations. We introduce KeplerAgent, an agentic framework that explicitly follows this scientific reasoning process. The agent coordinates physics-based tools to extract intermediate structure and uses these results to configure symbolic regression engines such as PySINDy and PySR, including their function libraries and structural constraints. Across a suite of physical equation benchmarks, KeplerAgent achieves substantially higher symbolic accu...