[2510.09312] Verifying Chain-of-Thought Reasoning via Its Computational Graph
Summary
The paper presents a method for verifying Chain-of-Thought (CoT) reasoning in language models: Circuit-based Reasoning Verification (CRV), which analyzes the model's computational (attribution) graphs to identify reasoning errors.
Why It Matters
Understanding and improving AI reasoning is crucial for building reliable models. This research provides a framework for diagnosing not just whether a reasoning step fails, but why, which can enable targeted corrections and deeper insight into model decision-making.
Key Takeaways
- Introduces a white-box method for verifying reasoning in AI models.
- Identifies distinct structural patterns in computational graphs that correlate with reasoning errors.
- Demonstrates that targeted interventions can correct faulty reasoning based on graph analysis.
- Highlights the domain-specific nature of reasoning errors across different tasks.
- Moves beyond error detection to a causal understanding of AI reasoning processes.
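The core recipe in the takeaways, extracting structural features from a step's computational graph and training a classifier to separate correct from faulty steps, can be sketched as below. This is an illustrative toy, not the paper's CRV pipeline: the feature set (node count, edge count, max out-degree, density) and the perceptron classifier are hypothetical stand-ins for the attribution-graph features and classifier the authors actually use.

```python
# Hedged sketch of classifying reasoning steps by graph structure.
# Graphs are adjacency dicts {node: [successors]}; features and the
# perceptron are illustrative assumptions, not the paper's method.

def graph_features(adj):
    """Extract simple structural features from an adjacency dict."""
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    n = len(nodes)                                  # node count
    m = sum(len(vs) for vs in adj.values())         # edge count
    max_out = max((len(adj.get(u, [])) for u in nodes), default=0)
    density = m / (n * (n - 1)) if n > 1 else 0.0   # directed density
    return [n, m, max_out, density]

def train_perceptron(X, y, epochs=50, lr=0.1):
    """Tiny perceptron: label 1 = 'correct step', 0 = 'faulty step'."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = t - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy training data: sparse chain-like graphs labeled correct (1),
# dense tangled graphs labeled faulty (0).
graphs = [
    {"a": ["b"], "b": ["c"]},                                # correct
    {"a": ["b"]},                                            # correct
    {"a": ["b", "c", "d"], "b": ["c", "d"], "c": ["d"]},     # faulty
    {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]},     # faulty
]
labels = [1, 1, 0, 0]

X = [graph_features(g) for g in graphs]
w, b = train_perceptron(X, labels)
```

On this linearly separable toy data the perceptron converges in a few epochs; in the paper's setting, the signal instead comes from rich attribution-graph fingerprints that turn out to be strongly predictive of step-level errors.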
Abstract
arXiv:2510.09312 [cs.CL] (Computer Science > Computation and Language). Submitted on 10 Oct 2025 (v1), last revised 22 Feb 2026 (this version, v2).
Authors: Zheng Zhao, Yeskendir Koishekenov, Xianjun Yang, Naila Murray, Nicola Cancedda
Current Chain-of-Thought (CoT) verification methods predict reasoning correctness based on outputs (black-box) or activations (gray-box), but offer limited insight into why a computation fails. We introduce a white-box method: Circuit-based Reasoning Verification (CRV). We hypothesize that attribution graphs of correct CoT steps, viewed as execution traces of the model's latent reasoning circuits, possess distinct structural fingerprints from those of incorrect steps. By training a classifier on structural features of these graphs, we show that these traces contain a powerful signal of reasoning errors. Our white-box approach yields novel scientific insights unattainable by other methods. (1) We demonstrate that structural signatures of error are highly predictive, establishing the viability of verifying reasoning directly via its computational graph. (2) We find these signatures to be highly domain-specific, revealing that failures in different reasoning tasks manifest as distinct computational patterns. (3) We...