[2601.18987] LLMs versus the Halting Problem: Revisiting Program Termination Prediction
Computer Science > Computation and Language
arXiv:2601.18987 (cs)
[Submitted on 26 Jan 2026 (v1), last revised 28 Mar 2026 (this version, v4)]

Title: LLMs versus the Halting Problem: Revisiting Program Termination Prediction
Authors: Oren Sultan, Jordi Armengol-Estape, Pascal Kesseli, Julien Vanegue, Dafna Shahaf, Yossi Adi, Peter O'Hearn

Abstract: Determining whether a program terminates is a central problem in computer science. Turing's foundational result established the Halting Problem as undecidable: no algorithm can universally determine termination for all programs and inputs. Consequently, automatic verification tools approximate termination, sometimes failing to prove or disprove it; these tools rely on problem-specific architectures and are usually tied to particular programming languages. Recent progress in large language models (LLMs) raises the following question: can LLMs reliably predict program termination? In this work, we evaluate LLMs on a diverse set of programs from the Termination category of the International Competition on Software Verification (SV-Comp) 2025. Our results suggest that LLMs perform remarkably well at predicting program termination: GPT-5 and Claude Sonnet-4.5 would rank just behind the top-ranked tool (using test-time scaling), and Code Wor...