[2510.10815] DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems
About this article
Abstract page for arXiv paper 2510.10815: DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems
Computer Science > Artificial Intelligence arXiv:2510.10815 (cs) [Submitted on 12 Oct 2025 (v1), last revised 6 Apr 2026 (this version, v4)] Title:DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems Authors:Meiru Zhang, Philipp Borchert, Milan Gritta, Gerasimos Lampouras View a PDF of the paper titled DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems, by Meiru Zhang and 3 other authors View PDF HTML (experimental) Abstract:Automating the formalization of mathematical statements for theorem proving remains a major challenge for Large Language Models (LLMs). LLMs struggle to identify and utilize the prerequisite mathematical knowledge and its corresponding formal representation in languages like Lean. Current retrieval-augmented autoformalization methods query external libraries using the informal statement directly, but overlook a fundamental limitation: informal statements lack direct mappings to mathematical theorems and lemmata, nor do those theorems translate trivially into the formal primitives of languages like Lean. To address this, we introduce DRIFT, a novel framework that enables LLMs to decompose informal mathematical statements into smaller, more tractable "sub-components". This facilitates targeted retrieval of premises from mathematical libraries such as Mathlib. Additionally, DRIFT retrieves illustrative theorems to help models use premises more effectively in formalization tasks. We evaluate DRIFT across diverse benchmarks (ProofNet...