[2503.11572] Implicit Bias-Like Patterns in Reasoning Models
arXiv:2503.11572 (cs)
[Submitted on 14 Mar 2025 (v1), last revised 6 Apr 2026 (this version, v4)]

Title: Implicit Bias-Like Patterns in Reasoning Models
Authors: Messi H.J. Lee, Calvin K. Lai

Abstract: Implicit biases refer to automatic mental processes that shape perceptions, judgments, and behaviors. Previous research on "implicit bias" in LLMs focused primarily on outputs rather than the processes underlying the outputs. We present the Reasoning Model Implicit Association Test (RM-IAT) to study implicit bias-like processing in reasoning models, LLMs that use step-by-step reasoning to solve complex tasks. Using RM-IAT, we find that reasoning models like o3-mini, DeepSeek-R1, gpt-oss-20b, and Qwen-3 8B consistently expend more reasoning tokens on association-incompatible tasks than association-compatible tasks, suggesting greater computational effort when processing counter-stereotypical information. Conversely, Claude 3.7 Sonnet exhibited reversed patterns, which thematic analysis associated with its unique internal focus on reasoning about bias and stereotypes. These findings demonstrate that reasoning models exhibit distinct implicit bias-like patterns and that these patterns vary significantly depending on the models' internal reasoning content.

Subjects: Computers and Society...
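
The comparison the abstract describes, reasoning tokens spent on association-incompatible versus association-compatible tasks, can be illustrated with a minimal sketch. The abstract does not state which statistics the authors use, so the mean difference and Cohen's d below are assumptions chosen for illustration, the function name `token_gap` is hypothetical, and the token counts are made up rather than taken from the paper.

```python
from statistics import mean, stdev


def token_gap(compatible, incompatible):
    """Return (mean difference, Cohen's d) for reasoning-token counts.

    A positive difference means more tokens were spent on the
    association-incompatible condition. Cohen's d uses the pooled
    sample standard deviation of the two conditions.
    """
    diff = mean(incompatible) - mean(compatible)
    n1, n2 = len(compatible), len(incompatible)
    pooled_var = (
        (n1 - 1) * stdev(compatible) ** 2
        + (n2 - 1) * stdev(incompatible) ** 2
    ) / (n1 + n2 - 2)
    return diff, diff / pooled_var ** 0.5


# Hypothetical token counts, for illustration only (not data from the paper).
compatible = [100, 120, 110, 130]
incompatible = [150, 170, 160, 180]

diff, d = token_gap(compatible, incompatible)
print(f"mean difference: {diff} tokens, Cohen's d: {d:.2f}")
```

Under this framing, a positive gap would correspond to the pattern the abstract reports for o3-mini, DeepSeek-R1, gpt-oss-20b, and Qwen-3 8B, and a negative gap to the reversed pattern reported for Claude 3.7 Sonnet.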