[2602.13215] When to Think Fast and Slow? AMOR: Entropy-Based Metacognitive Gate for Dynamic SSM-Attention Switching


Summary

The paper presents AMOR, an entropy-based metacognitive gate that lets a state space model (SSM) backbone engage sparse attention only when its prediction entropy is high, improving both efficiency and retrieval accuracy.

Why It Matters

This research addresses the uniform per-position computation of transformers and the weak long-range retrieval of SSMs by introducing a hybrid architecture that allocates attention based on uncertainty, potentially advancing AI efficiency and effectiveness on complex tasks.

Key Takeaways

  • AMOR uses prediction entropy to decide when to engage attention, improving computational efficiency (a minimal sketch of this gate follows the list below).
  • The model outperforms both SSM-only and transformer-only approaches in retrieval tasks.
  • AMOR achieves perfect retrieval accuracy while using attention on only 22% of positions.
  • The approach offers interpretable adaptive computation, linking routing decisions to information theory.
  • This research could influence future designs of AI architectures by integrating cognitive theories.
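
The gating criterion in the first takeaway can be made concrete with a short sketch. The snippet below is a minimal, hypothetical illustration rather than the authors' code: it computes the Shannon entropy of the SSM's next-token distribution at each position and flags positions whose entropy exceeds a threshold as candidates for sparse attention. The function name `entropy_gate` and the default threshold are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def entropy_gate(logits: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    """Flag positions where the SSM backbone is 'uncertain'.

    logits: (batch, seq_len, vocab) next-token predictions from the SSM.
    threshold: entropy cutoff in nats (illustrative value; the paper reports a
        ~1.09-nat gap between retrieval and local positions).
    Returns a boolean mask of shape (batch, seq_len), True where attention
    should be engaged.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    # Shannon entropy H = -sum_v p(v) * log p(v), measured in nats.
    entropy = -(probs * log_probs).sum(dim=-1)
    return entropy > threshold
```

In this reading, high entropy marks positions where the backbone's local state is insufficient, so the model "thinks slow" and retrieves via attention; low-entropy positions stay on the cheap O(n) SSM path.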

Computer Science > Artificial Intelligence
arXiv:2602.13215 (cs) [Submitted on 22 Jan 2026]
Title: When to Think Fast and Slow? AMOR: Entropy-Based Metacognitive Gate for Dynamic SSM-Attention Switching
Authors: Haoran Zheng

Abstract: Transformers allocate uniform computation to every position, regardless of difficulty. State Space Models (SSMs) offer efficient alternatives but struggle with precise information retrieval over a long horizon. Inspired by dual-process theories of cognition (Kahneman, 2011), we propose AMOR (Adaptive Metacognitive Output Router), a hybrid architecture that dynamically engages sparse attention only when an SSM backbone is "uncertain"--as measured by prediction entropy. Compared to standard transformers, AMOR gains efficiency by projecting keys and values from SSM hidden states (Ghost KV), reusing the SSM's O(n) computation rather than requiring O(n^2) attention at every layer. On small-scale synthetic retrieval tasks, AMOR outperforms both SSM-only and transformer-only baselines, achieving perfect retrieval accuracy while engaging attention on only 22% of positions. We validate that prediction entropy reliably signals retrieval need, with a gap of 1.09 nats (nearly half the entropy range) between retrieval and local positions. Additionally, our approach provides interp...
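
The abstract's "Ghost KV" idea, projecting keys and values from hidden states the SSM has already computed rather than running full attention at every layer, can be sketched as follows. This is a single-head, loop-based illustration under our own assumptions (the class name GhostKVAttention, the head size, and the residual combination are all hypothetical), intended only to show how a gate mask and reused SSM states could fit together.

```python
import torch
import torch.nn as nn

class GhostKVAttention(nn.Module):
    """Hypothetical sketch: project keys/values from SSM hidden states ('ghost' KV)
    and run sparse attention only at positions flagged by the entropy gate."""

    def __init__(self, d_model: int, d_head: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_head)
        self.k_proj = nn.Linear(d_model, d_head)   # ghost keys from SSM states
        self.v_proj = nn.Linear(d_model, d_head)   # ghost values from SSM states
        self.out_proj = nn.Linear(d_head, d_model)

    def forward(self, ssm_states: torch.Tensor, gate_mask: torch.Tensor) -> torch.Tensor:
        # ssm_states: (batch, seq_len, d_model) hidden states already produced by
        # the O(n) SSM backbone; gate_mask: (batch, seq_len) bool from the entropy gate.
        k = self.k_proj(ssm_states)
        v = self.v_proj(ssm_states)
        out = torch.zeros_like(ssm_states)
        seq_len = ssm_states.size(1)
        for b in range(ssm_states.size(0)):          # loop over batch for clarity
            idx = gate_mask[b].nonzero(as_tuple=True)[0]
            if idx.numel() == 0:
                continue                             # no uncertain positions: pure SSM path
            q = self.q_proj(ssm_states[b, idx])      # queries only at gated positions
            scores = q @ k[b].T / k.size(-1) ** 0.5  # (num_gated, seq_len)
            # Causal mask: a gated query attends only to current and earlier positions.
            causal = torch.arange(seq_len, device=scores.device)[None, :] <= idx[:, None]
            scores = scores.masked_fill(~causal, float("-inf"))
            out[b, idx] = self.out_proj(torch.softmax(scores, dim=-1) @ v[b])
        return ssm_states + out                      # SSM output refined where gated
```

Because queries are formed only at gated positions (about 22% of positions in the paper's retrieval experiments), the quadratic cost of attention applies only to that subset, while keys and values reuse the SSM's O(n) computation.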

