[2602.17229] Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom's Taxonomy

Summary

This paper investigates the mechanistic interpretability of cognitive complexity in Large Language Models (LLMs), using Bloom's Taxonomy as a hierarchical lens and demonstrating that cognitive levels are linearly encoded in the models' internal representations.

Why It Matters

Understanding how LLMs process cognitive complexity is crucial for improving their design and evaluation. This research provides insights into the internal workings of LLMs, potentially guiding future developments in AI interpretability and educational applications.

Key Takeaways

  • The study uses Bloom's Taxonomy to evaluate cognitive complexity in LLMs.
  • Linear classifiers achieved approximately 95% accuracy in distinguishing cognitive levels.
  • Cognitive difficulty is resolved early in the model's forward pass.
  • Representations become increasingly separable across model layers.
  • Findings contribute to the understanding of LLM interpretability.

Computer Science > Artificial Intelligence
arXiv:2602.17229 (cs)
[Submitted on 19 Feb 2026]

Title: Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom's Taxonomy
Authors: Bianca Raimondi, Maurizio Gabbrielli

Abstract: The black-box nature of Large Language Models necessitates novel evaluation frameworks that transcend surface-level performance metrics. This study investigates the internal neural representations of cognitive complexity using Bloom's Taxonomy as a hierarchical lens. By analyzing high-dimensional activation vectors from different LLMs, we probe whether different cognitive levels, ranging from basic recall (Remember) to abstract synthesis (Create), are linearly separable within the model's residual streams. Our results demonstrate that linear classifiers achieve approximately 95% mean accuracy across all Bloom levels, providing strong evidence that cognitive level is encoded in a linearly accessible subspace of the model's representations. These findings provide evidence that the model resolves the cognitive difficulty of a prompt early in the forward pass, with representations becoming increasingly separable across layers.

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
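The probing technique the abstract describes can be sketched in a few lines. Since the paper's actual model activations and prompt dataset are not reproduced here, this is a minimal illustration on synthetic data: each Bloom level is simulated as a Gaussian cluster in a hypothetical residual-stream space, and successive "layers" are simulated with wider cluster spacing, so a linear probe's accuracy rises across them, mirroring the reported trend. The dimensionality, sample counts, and separation values are all assumptions, not the authors' setup.

```python
# Hypothetical sketch of linear probing for Bloom levels (not the paper's code).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
BLOOM_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]
D = 256           # assumed residual-stream width
N_PER_CLASS = 80  # assumed prompts per Bloom level

def simulate_layer(separation: float):
    """Synthetic activations: one Gaussian cluster per Bloom level.

    Larger `separation` stands in for a deeper layer whose representations
    of cognitive level are more linearly separable.
    """
    means = rng.normal(size=(len(BLOOM_LEVELS), D)) * separation
    X = np.vstack([m + rng.normal(size=(N_PER_CLASS, D)) for m in means])
    y = np.repeat(np.arange(len(BLOOM_LEVELS)), N_PER_CLASS)
    return X, y

# Fit a linear probe (logistic regression) per simulated layer and report
# cross-validated accuracy, the same metric family the abstract quotes.
accuracies = []
for sep in [0.02, 0.1, 0.5]:  # early / middle / late layers (assumed values)
    X, y = simulate_layer(sep)
    probe = LogisticRegression(max_iter=2000)
    acc = cross_val_score(probe, X, y, cv=5).mean()
    accuracies.append(acc)
    print(f"separation={sep:.2f}  probe accuracy={acc:.3f}")
```

On real models the same loop would run over activation vectors extracted at each layer of the forward pass instead of `simulate_layer` output; the probe itself stays a plain linear classifier, which is what makes high accuracy evidence of a linearly accessible subspace.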
