[2604.01880] DDCL-INCRT: A Self-Organising Transformer with Hierarchical Prototype Structure (Theoretical Foundations)

Computer Science > Machine Learning
arXiv:2604.01880 (cs)
[Submitted on 2 Apr 2026]

Authors: Giansalvo Cirrincione

Abstract: Modern neural networks of the transformer family require the practitioner to decide, before training begins, how many attention heads to use, how deep the network should be, and how wide each component should be. These decisions are made without knowledge of the task, producing architectures that are systematically larger than necessary: empirical studies find that a substantial fraction of heads and layers can be removed after training without performance loss.

This paper introduces DDCL-INCRT, an architecture that determines its own structure during training. Two complementary ideas are combined. The first, DDCL (Deep Dual Competitive Learning), replaces the feedforward block with a dictionary of learned prototype vectors representing the most informative directions in the data. The prototypes spread apart automatically, driven by the training objective, without explicit regularisation. The second, INCRT (Incremental Transformer), controls the number of heads: starting from one, it adds a new head only when the directional information uncaptured by exi...
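
To make the idea of a prototype dictionary more concrete, here is a minimal sketch of a generic soft competitive-learning step. It is not the paper's formulation: the abstract does not give the DDCL equations, so the squared-distance softmax, the `temperature` parameter, and the function names `soft_assign` and `prototype_block` are all assumptions chosen for illustration. The sketch only shows how a token could be re-expressed as a weighted mix of prototype vectors in place of a feedforward transformation.

```python
import math


def soft_assign(token, prototypes, temperature=1.0):
    """Softmax over negative squared distances from one token to each prototype.

    Hypothetical helper: the distance measure and temperature are assumptions,
    not taken from the paper.
    """
    dists = [sum((t - p) ** 2 for t, p in zip(token, proto)) for proto in prototypes]
    logits = [-d / temperature for d in dists]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_block(token, prototypes, temperature=1.0):
    """Return the assignment-weighted mix of prototypes for one token."""
    weights = soft_assign(token, prototypes, temperature)
    dim = len(token)
    return [sum(w * proto[i] for w, proto in zip(weights, prototypes)) for i in range(dim)]


# Toy dictionary of three 2-D prototypes (illustrative values only).
prototypes = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
token = [0.9, 0.1]

weights = soft_assign(token, prototypes, temperature=0.1)
output = prototype_block(token, prototypes, temperature=0.1)
```

With a low temperature the assignment is nearly winner-take-all, so `output` lies close to the nearest prototype. A higher temperature blends several prototypes. How the prototypes are trained, and how INCRT decides when to add a head, is not covered by this sketch.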

Originally published on April 03, 2026. Curated by AI News.
