[2603.05290] X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes
Computer Science > Artificial Intelligence
arXiv:2603.05290 (cs)
[Submitted on 5 Mar 2026]
Title: X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes
Authors: Gao Tianxi, Cai Yufan, Yuan Yusi, Dong Jin Song
Abstract: Large language models (LLMs) achieve promising performance, yet their ability to reason remains poorly understood. Existing evaluations largely emphasize task-level accuracy, often conflating pattern matching with reasoning capability. We present X-RAY, an explainable reasoning analysis system that maps the LLM reasoning capability using calibrated, formally verified probes. We model reasoning capability as a function of extractable structure, operationalized through formal properties such as constraint interaction, reasoning depth, and solution-space geometry. X-RAY generates probes via formal tools with controlled structural variations, enabling precise isolation of incremental structural information through formal calibration and verification. We evaluate state-of-the-art LLMs on problems ranging from junior-level to advanced in mathematics, physics, and chemistry. Our analysis reveals a systematic asymmetry in LLM reasoning: models are relatively robust to constraint refinement, where additional conditions shrink an existing solution space, but degrade shar...