[2603.17233] Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning
About this article
Abstract page for arXiv paper 2603.17233: Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning
Computer Science > Artificial Intelligence arXiv:2603.17233 (cs) [Submitted on 18 Mar 2026 (v1), last revised 26 Mar 2026 (this version, v2)] Title:Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning Authors:Zhiyu Ni, Zheng Liang, Liangcheng Song, Chenrui Cao, Xian Zhang, Alberto Sangiovanni-Vincentelli, Pierluigi Nuzzo View a PDF of the paper titled Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning, by Zhiyu Ni and 6 other authors View PDF HTML (experimental) Abstract:Auto-formalization (AF) translates natural-language reasoning problems into solver-executable programs, enabling symbolic solvers to perform sound logical deduction. In practice, however, AF pipelines are currently brittle: programs may fail to execute, or execute but encode incorrect semantics. While prior work largely mitigates syntactic failures via repairs based on solver feedback, reducing semantics failures remains a major bottleneck. We propose Draft-and-Prune (D&P), an inference-time framework that improves AF-based logical reasoning via diversity and verification. D&P first drafts multiple natural-language plans and conditions program generation on them. It further prunes executable but contradictory or ambiguous formalizations, and aggregates predictions from surviving paths via majority voting. Across four representative benchmarks (AR-LSAT, ProofWriter, PrOntoQA, LogicalDeduction), D&P substantially strengthens AF-based reaso...