[2603.22784] Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models
Computer Science > Machine Learning

arXiv:2603.22784 (cs) [Submitted on 24 Mar 2026]

Title: Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models
Authors: Amir Azarmehr, Soheil Behnezhad, Alma Ghafari

Abstract: Large language models (LLMs) can often produce substantially better outputs when allowed to use additional test-time computation, such as sampling, chain of thought, backtracking, or revising partial solutions. Despite the growing empirical success of such techniques, there is limited theoretical understanding of how inference-time computation should be structured, or of what constitutes an optimal use of a fixed computation budget. We model test-time computation as an algorithm interacting with a Markov chain: at any point, the algorithm may resume generation from any previously observed state. That is, unlike standard Markov chains, where states are drawn passively, we allow the algorithm to backtrack to any previously observed state of the Markov chain at any time. Many existing test-time algorithms, such as Chain-of-Thought (CoT) (Wei et al., 2023), Tree-of-Thoughts (ToT) (Yao et al., 2023), and Best-of-$k$ (Brown et al., 2024), can be seen as specific algorithms in this model. We prove that while backtracking can reduce the number of generations expon...
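The abstract's model — an algorithm that interacts with a Markov chain and may resume generation from any previously observed state — can be sketched in a few lines. The sketch below is our own illustration, not the paper's formalism: all names (`BacktrackingMarkovChain`, `step_from`, `best_of_k`) are hypothetical, and the transition function is a toy random walk. It shows how Best-of-$k$ arises as a special case that only ever backtracks to the start state.

```python
import random

class BacktrackingMarkovChain:
    """Toy model of test-time computation: the algorithm observes states of a
    Markov chain and may sample the next transition from ANY state seen so far
    (illustrative sketch; not the paper's exact formalism)."""

    def __init__(self, transition, start, seed=0):
        self.transition = transition   # (state, rng) -> next state, stochastic
        self.rng = random.Random(seed)
        self.observed = [start]        # every state observed so far
        self.generations = 0           # budget spent: number of sampled transitions

    def step_from(self, state):
        """Backtrack to a previously observed state and sample one transition."""
        assert state in self.observed, "may only resume from an observed state"
        nxt = self.transition(state, self.rng)
        self.observed.append(nxt)
        self.generations += 1
        return nxt

def best_of_k(chain, start, depth, k):
    """Best-of-k as a special case: k independent rollouts, each restarting
    (i.e., backtracking) to the start state, spending k * depth generations."""
    rollouts = []
    for _ in range(k):
        s = start
        for _ in range(depth):
            s = chain.step_from(s)
        rollouts.append(s)
    return rollouts
```

For example, running `best_of_k` with `k=3` and `depth=5` on a +/-1 random walk spends exactly 15 generations from the budget; a Tree-of-Thoughts-style algorithm would instead branch by calling `step_from` on intermediate observed states rather than always restarting from `start`.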