[2603.02775] From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench
Computer Science > Computation and Language
arXiv:2603.02775 (cs)
[Submitted on 3 Mar 2026]

Title: From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench

Authors: Weikang Shi, Houxing Ren, Junting Pan, Aojun Zhou, Ke Wang, Zimu Lu, Yunqiao Yang, Yuxuan Hu, Linda Wei, Mingjie Zhan, Hongsheng Li

Abstract: Large Language Models (LLMs) show significant potential in AI mathematical tutoring, yet current evaluations often rely on simplistic metrics or narrow pedagogical scenarios, failing to assess comprehensive, multi-turn teaching effectiveness. In this paper, we introduce KMP-Bench, a comprehensive K-8 Mathematical Pedagogical Benchmark designed to assess LLMs from two complementary perspectives. The first module, KMP-Dialogue, evaluates holistic pedagogical capabilities against six core principles (e.g., Challenge, Explanation, Feedback), leveraging a novel multi-turn dialogue dataset constructed by weaving together diverse pedagogical components. The second module, KMP-Skills, provides a granular assessment of foundational tutoring abilities, including multi-turn problem-solving, error detection and correction, and problem generation. Our evaluations on KMP-Bench reveal a key disparity: while leading LLMs excel at tasks with verifiable solutions, they struggle with the...