[2603.20256] SciNav: A General Agent Framework for Scientific Coding Tasks
Computer Science > Computation and Language
arXiv:2603.20256 (cs)
[Submitted on 11 Mar 2026]
Title: SciNav: A General Agent Framework for Scientific Coding Tasks
Authors: Tianshu Zhang, Huan Sun

Abstract: Autonomous science agents built on large language models (LLMs) are increasingly used to generate hypotheses, design experiments, and produce reports. However, prior work mainly targets open-ended scientific problems with subjective outputs that are difficult to evaluate. Scientific coding benchmarks, by contrast, provide executable outputs for objective assessment. Existing approaches remain engineering-driven pipelines, revealing the need for structured, end-to-end science agent frameworks for scientific coding tasks. We address this gap by focusing on scientific coding tasks, where evaluation can be made rigorously, and introducing an agent framework SciNav (Scientific Navigator) that enables more effective solution exploration. Our framework is designed to operate under constrained search budgets, moving beyond reliance on pre-defined success metrics and prolonged search cycles. Inspired by findings that comparative judgments often reveal finer-grained quality differences and therefore provide greater discriminative power than absolute scoring, our framework leverages pairwise relative judgments within a tree search process to select...
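The abstract is cut off at this point, so the paper's actual selection rule is not shown here. As a rough illustration of the general idea only (ranking candidate solutions by pairwise comparisons rather than absolute scores), the following minimal sketch keeps the candidates that win the most head-to-head comparisons. The function name `select_by_pairwise_wins`, the `prefer` judge callback, and the round-robin win-count rule are all assumptions made for illustration; they are not taken from SciNav.

```python
from itertools import combinations
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def select_by_pairwise_wins(
    candidates: Sequence[T],
    prefer: Callable[[T, T], bool],
    k: int,
) -> List[T]:
    """Return the k candidates with the most pairwise wins.

    `prefer(a, b)` is a hypothetical judge that returns True when `a`
    is judged better than `b`. In an LLM agent this would be a model
    call; here it is any Python callable.
    """
    wins = [0] * len(candidates)
    # Round-robin: every pair is compared exactly once.
    for i, j in combinations(range(len(candidates)), 2):
        if prefer(candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # Sort by descending win count; ties keep original order.
    ranked = sorted(range(len(candidates)), key=lambda idx: (-wins[idx], idx))
    return [candidates[idx] for idx in ranked[:k]]


# Toy usage: each candidate carries a hidden quality value that the
# stand-in judge compares directly.
toy_candidates = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
toy_judge = lambda x, y: x[1] >= y[1]
selected = select_by_pairwise_wins(toy_candidates, toy_judge, k=2)
```

A round-robin over n candidates costs n(n-1)/2 judge calls, so under a constrained search budget one would likely compare only a subset of pairs or only the children of a single tree node at a time; that trade-off is not modeled in this sketch.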