[2510.04040] FaithCoT-Bench: Benchmarking Instance-Level Faithfulness of Chain-of-Thought Reasoning
Computer Science > Artificial Intelligence
arXiv:2510.04040 (cs)
[Submitted on 5 Oct 2025 (v1), last revised 28 Feb 2026 (this version, v2)]

Title: FaithCoT-Bench: Benchmarking Instance-Level Faithfulness of Chain-of-Thought Reasoning
Authors: Xu Shen, Song Wang, Zhen Tan, Laura Yao, Xinyu Zhao, Kaidi Xu, Xin Wang, Tianlong Chen

Abstract: Large language models (LLMs) increasingly rely on Chain-of-Thought (CoT) prompting to improve problem-solving and to provide seemingly transparent explanations. However, growing evidence shows that CoTs often fail to faithfully represent the model's underlying reasoning process, raising concerns about their reliability in high-risk applications. Although prior studies have focused on mechanism-level analyses showing that CoTs can be unfaithful, they leave open the practical challenge of deciding whether a specific trajectory is faithful to the model's internal reasoning. To address this gap, we introduce FaithCoT-Bench, a unified benchmark for instance-level CoT unfaithfulness detection. Our framework establishes a rigorous task formulation that casts unfaithfulness detection as a discriminative decision problem, and provides FINE-CoT (Faithfulness instance evaluation for Chain-of-Thought), an expert-annotated collection of over 1,000 trajectories generated by four representative LLMs ac...
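The abstract casts unfaithfulness detection as a per-instance discriminative decision: given one (question, CoT trajectory, answer) triple, a detector must output a faithful/unfaithful verdict. As an illustration only, the sketch below shows that interface with a deliberately naive heuristic; the dataclass fields, the detector name, and the keyword rule are our assumptions, not the paper's method.

```python
from dataclasses import dataclass

@dataclass
class CoTInstance:
    """One benchmark instance: a question, the model's
    chain-of-thought text, and its final answer."""
    question: str
    trajectory: str
    answer: str

def keyword_detector(inst: CoTInstance) -> bool:
    """Toy instance-level detector (hypothetical, not from the
    paper): flag a trajectory as unfaithful (True) when the
    stated answer never appears in the reasoning text. Real
    detectors would inspect the reasoning itself; this only
    illustrates the binary decision interface."""
    return inst.answer not in inst.trajectory

consistent = CoTInstance(
    question="2 + 2 = ?",
    trajectory="Adding 2 and 2 gives 4.",
    answer="4",
)
disconnected = CoTInstance(
    question="2 + 2 = ?",
    trajectory="The capital of France is Paris.",
    answer="4",
)
print(keyword_detector(consistent))    # False: answer grounded in the trace
print(keyword_detector(disconnected))  # True: answer unsupported by the trace
```

The point of the interface, not the heuristic, is what matters: every detector evaluated under such a formulation reduces to a function from a single trajectory to a binary faithfulness verdict, which is what makes instance-level benchmarking possible.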