[2509.23365] Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought
Computer Science > Machine Learning
arXiv:2509.23365 (cs)
[Submitted on 27 Sep 2025 (v1), last revised 1 Mar 2026 (this version, v3)]

Title: Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought
Authors: Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian

Abstract: Previous work shows that the chain of continuous thought (continuous CoT) improves the reasoning capability of large language models (LLMs) by enabling implicit parallel thinking, and a subsequent work provided theoretical insight by showing that a two-layer transformer equipped with continuous CoT can efficiently solve directed graph reachability by maintaining a superposition of multiple reasoning traces in the continuous thought. However, it remains unclear how the superposition mechanism is naturally learned from gradient-based training methods. To fill this gap, we theoretically analyze the training dynamics of a simplified two-layer transformer on the directed graph reachability problem to unveil how the superposition mechanism emerges during training in two training stages -- (i) a thought-generation stage that autoregressively expands the continuous thought, and (ii) a prediction stage that converts the thought into the final answer. Our analysis reveals that du...
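As a loose analogy (not the paper's transformer construction), the "superposition of multiple reasoning traces" for directed graph reachability can be pictured as parallel frontier expansion in breadth-first search: instead of following one path at a time, every partial trace is advanced simultaneously as a single set. A minimal sketch, with all names hypothetical:

```python
def reachable(adj, source):
    """Frontier-based reachability in a directed graph.

    The frontier set plays the role of a 'superposition': it holds the
    endpoints of all partial reasoning traces at once, and each step
    expands every trace in parallel rather than one at a time.
    """
    frontier = {source}          # superposed endpoints of all current traces
    visited = {source}
    while frontier:
        # one expansion step advances every trace in the superposition
        frontier = {v for u in frontier for v in adj.get(u, []) if v not in visited}
        visited |= frontier
    return visited

# Toy directed graph: node -> list of successors
adj = {0: [1, 2], 1: [3], 2: [3, 4], 4: [5]}
print(sorted(reachable(adj, 0)))  # -> [0, 1, 2, 3, 4, 5]
```

This runs in a number of expansion steps bounded by the graph diameter, mirroring the efficiency claim in the abstract: tracking the whole frontier at once avoids enumerating exponentially many individual paths.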