[2604.04720] What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features
Computer Science > Computation and Language

arXiv:2604.04720 (cs) [Submitted on 6 Apr 2026]

Title: What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features

Authors: Dayeon Ki, Kevin Duh, Marine Carpuat

Abstract: Large Reasoning Models (LRMs) still exhibit large performance gaps between English and other languages, yet much current work assumes these gaps can be closed simply by making reasoning in every language resemble English reasoning. This work challenges that assumption by asking instead: what actually characterizes effective reasoning in multilingual settings, and to what extent do English-derived reasoning features genuinely help in other languages? We first define a suite of measurable reasoning features spanning the multilingual-alignment, reasoning-step, and reasoning-flow aspects of reasoning traces, and use logistic regression to quantify how each feature associates with final-answer accuracy. We further train sparse autoencoders over multilingual traces to automatically discover latent reasoning concepts that instantiate or extend these features. Finally, we use the features as test-time selection policies to examine whether they can steer models toward stronger multilingual reasoning. Across two mathematical reasoning benchmarks, four LRMs, and 10 languages, w...
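The abstract's first analysis step, regressing per-trace features against final-answer correctness, can be illustrated with a minimal sketch. The feature names and the synthetic data below are invented for illustration; the paper's actual features and fitting setup are not specified on this page. The sketch fits a logistic regression by gradient descent and reads coefficient magnitudes as association strengths.

```python
# Hypothetical sketch: associate measurable reasoning-trace features
# with answer accuracy via logistic regression. Feature names and
# data are invented; only the technique follows the abstract.
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Invented per-trace features: alignment score, step count, flow score
X = rng.normal(size=(n, 3))
# Synthetic correctness labels driven mostly by the first feature
y = (1.5 * X[:, 0] + 0.2 * rng.normal(size=n) > 0).astype(float)

# Fit logistic regression by full-batch gradient descent
w, b = np.zeros(3), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # predicted P(correct)
    w -= 0.5 * (X.T @ (p - y) / n)       # gradient step on weights
    b -= 0.5 * (p - y).mean()            # gradient step on bias

# Coefficient magnitude ~ strength of association with accuracy
for name, coef in zip(["alignment", "step_count", "flow"], w):
    print(f"{name}: {coef:+.2f}")
```

By construction, the "alignment" coefficient dominates here; on real traces the fitted coefficients would rank the paper's features by how strongly each predicts a correct final answer.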