[2602.13021] Prior-Guided Symbolic Regression: Towards Scientific Consistency in Equation Discovery
Summary
The paper presents Prior-Guided Symbolic Regression (PG-SR), a framework that improves the scientific consistency of discovered equations by encoding domain priors as executable constraints within a three-stage pipeline.
Why It Matters
Symbolic regression methods driven only by empirical risk minimization can produce equations that fit the data yet violate fundamental scientific principles, which the paper calls the Pseudo-Equation Trap. By adding explicit prior constraints to the search, PG-SR aims to make discovered equations more reliable for scientific use.
Key Takeaways
- The PG-SR framework enhances symbolic regression by incorporating prior domain knowledge.
- The three-stage pipeline includes warm-up, evolution, and refinement phases.
- Introduces a prior constraint checker that encodes domain priors as executable constraint programs.
- Demonstrates improved performance over existing methods across various domains.
- Addresses issues with data quality and scarcity effectively.
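To make the idea of a prior constraint checker more concrete, here is a minimal Python sketch of domain priors written as executable checks. It is an illustration only, not the paper's implementation: the three priors, the sample grid `GRID`, and the names `check_priors` and `violation_rate` are all hypothetical, and a candidate equation is assumed to be a plain Python function of one variable.

```python
from typing import Callable, Dict, List

# Assumption: a candidate equation is a function of a single real variable.
Equation = Callable[[float], float]

# Hypothetical sample points in (0, 5] used to test each prior numerically.
GRID: List[float] = [0.1 * i for i in range(1, 51)]


def nonnegative(f: Equation) -> bool:
    """Prior: the modeled quantity is never negative on the sampled domain."""
    return all(f(x) >= 0.0 for x in GRID)


def monotone_increasing(f: Equation) -> bool:
    """Prior: the modeled quantity does not decrease as x grows."""
    ys = [f(x) for x in GRID]
    return all(b >= a for a, b in zip(ys, ys[1:]))


def vanishes_at_origin(f: Equation) -> bool:
    """Prior: the modeled quantity is zero when the input is zero."""
    return abs(f(0.0)) < 1e-9


# Hypothetical registry of priors; a real system would load domain-specific ones.
PRIORS: Dict[str, Callable[[Equation], bool]] = {
    "nonnegative": nonnegative,
    "monotone_increasing": monotone_increasing,
    "vanishes_at_origin": vanishes_at_origin,
}


def check_priors(f: Equation) -> Dict[str, bool]:
    """Run every prior against a candidate; numerical failures count as violations."""
    results: Dict[str, bool] = {}
    for name, prior in PRIORS.items():
        try:
            results[name] = bool(prior(f))
        except (ValueError, ZeroDivisionError, OverflowError):
            results[name] = False
    return results


def violation_rate(f: Equation) -> float:
    """Fraction of priors that the candidate violates, between 0.0 and 1.0."""
    results = check_priors(f)
    return sum(not ok for ok in results.values()) / len(results)
```

Under these assumptions, `violation_rate(lambda x: x ** 2)` is `0.0` because the candidate satisfies all three priors, while `lambda x: 1.0 / x` fails both the monotonicity check and the origin check. Treating numerical errors as violations is a design choice of this sketch, made so that ill-defined candidates are not silently accepted.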
Computer Science > Machine Learning
arXiv:2602.13021 (cs)
[Submitted on 13 Feb 2026]
Title: Prior-Guided Symbolic Regression: Towards Scientific Consistency in Equation Discovery
Authors: Jing Xiao, Xinhai Chen, Jiaming Peng, Qinglin Wang, Menghan Jia, Zhiquan Lai, Guangping Yu, Dongsheng Li, Tiejun Li, Jie Liu
Abstract: Symbolic Regression (SR) aims to discover interpretable equations from observational data, with the potential to reveal underlying principles behind natural phenomena. However, existing approaches often fall into the Pseudo-Equation Trap: producing equations that fit observations well but remain inconsistent with fundamental scientific principles. A key reason is that these approaches are dominated by empirical risk minimization, lacking explicit constraints to ensure scientific consistency. To bridge this gap, we propose PG-SR, a prior-guided SR framework built upon a three-stage pipeline consisting of warm-up, evolution, and refinement. Throughout the pipeline, PG-SR introduces a prior constraint checker that explicitly encodes domain priors as executable constraint programs, and employs a Prior Annealing Constrained Evaluation (PACE) mechanism during the evolution stage to progressively steer discovery toward scientifically consistent regions. Theoretically, we prove that PG-SR ...
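The abstract does not give the PACE formula, so the following Python sketch only illustrates one plausible reading of annealing a prior constraint during evolution. It assumes a penalty weight that grows linearly with the generation index and is multiplied by the fraction of violated priors. The linear schedule, the additive penalty, and the names `penalty_weight`, `annealed_score`, and `max_weight` are assumptions for illustration, not taken from the paper.

```python
def penalty_weight(generation: int, total_generations: int,
                   max_weight: float = 10.0) -> float:
    """Hypothetical linear schedule: weight 0 at the first generation,
    max_weight at the last one."""
    if total_generations <= 1:
        return max_weight
    progress = generation / (total_generations - 1)
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return max_weight * progress


def annealed_score(data_error: float, violation_rate: float,
                   generation: int, total_generations: int,
                   max_weight: float = 10.0) -> float:
    """Lower is better: data-fit error plus an annealed prior-violation penalty.

    Assumption: violation_rate is the fraction of priors violated, in [0, 1].
    """
    weight = penalty_weight(generation, total_generations, max_weight)
    return data_error + weight * violation_rate


# Two hypothetical candidates: A fits better but violates half of the priors,
# B fits slightly worse but satisfies all of them.
candidate_a = {"data_error": 0.01, "violation_rate": 0.5}
candidate_b = {"data_error": 0.05, "violation_rate": 0.0}

early_a = annealed_score(generation=0, total_generations=100, **candidate_a)
early_b = annealed_score(generation=0, total_generations=100, **candidate_b)
late_a = annealed_score(generation=99, total_generations=100, **candidate_a)
late_b = annealed_score(generation=99, total_generations=100, **candidate_b)
```

In this sketch, candidate A scores better than B at generation 0, where only data fit counts, and worse than B at the final generation, where the prior penalty dominates. That shift is the intended effect of an annealed constraint: early exploration is left largely unconstrained, and later selection favors candidates consistent with the priors.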