[2602.17798] Grassmannian Mixture-of-Experts: Concentration-Controlled Routing on Subspace Manifolds
Summary
The paper presents Grassmannian Mixture-of-Experts (GrMoE), a novel routing framework that enhances expert assignment in machine learning models by controlling sparsity and utilization through a concentration matrix on subspace manifolds.
Why It Matters
GrMoE addresses a limitation of traditional Mixture-of-Experts models, whose softmax gating offers no principled control over the tradeoff between sparsity and expert utilization. By providing a continuous, interpretable mechanism for routing control, GrMoE can improve both model performance and interpretability, which matters for systems that must allocate compute efficiently across experts.
Key Takeaways
- GrMoE introduces a concentration matrix that controls routing entropy for expert assignment.
- The framework allows for uncertainty-aware expert assignment, reducing expert collapse.
- It achieves better load balance and lower perplexity than standard softmax-gated MoE baselines.
- The model supports post-hoc sparsity tuning without the need for retraining.
- Experts exhibit heterogeneous concentration values, indicating specialization in tasks.
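The entropy-control idea in the takeaways above can be sketched numerically. The following is a minimal NumPy illustration, not the paper's implementation: it assumes an isotropic concentration matrix $\Lambda = \kappa I$ (the paper's $\Lambda$ is generally a full matrix), random orthonormal bases standing in for learned expert subspaces, and hypothetical names (`gate`, `kappa`). Each expert scores a token by the squared norm of its projection onto the expert's subspace, scaled by the concentration; routing entropy then falls smoothly as the scalar knob grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def orthonormal_basis(d, r, rng):
    """Random d x r orthonormal basis: a point on the Grassmannian Gr(r, d)."""
    q, _ = np.linalg.qr(rng.normal(size=(d, r)))
    return q[:, :r]

d, r, n_experts = 16, 4, 8
# Stand-ins for learned expert subspaces (assumption: random, not trained).
bases = [orthonormal_basis(d, r, rng) for _ in range(n_experts)]

def gate(x, bases, kappa):
    """Soft routing weights from Bingham-style log-scores.

    Expert e scores kappa * ||U_e^T x||^2, i.e. a quadratic form with an
    isotropic concentration Lambda = kappa * I; a softmax over the scores
    gives the routing distribution.
    """
    x = x / np.linalg.norm(x)
    scores = np.array([kappa * np.sum((U.T @ x) ** 2) for U in bases])
    w = np.exp(scores - scores.max())
    return w / w.sum()

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

x = rng.normal(size=d)
for kappa in (0.0, 5.0, 50.0):
    w = gate(x, bases, kappa)
    print(f"kappa={kappa:5.1f}  entropy={entropy(w):.3f}  max weight={w.max():.3f}")
```

At `kappa = 0` the gate is uniform (maximum entropy, `log(8)` for 8 experts); increasing `kappa` concentrates mass on the best-aligned experts, replacing a hard top-$k$ cutoff with a smooth dial.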
Computer Science > Machine Learning
arXiv:2602.17798 (cs) [Submitted on 19 Feb 2026]
Authors: Ibne Farabi Shihab, Sanjeda Akter, Anuj Sharma
Abstract
Mixture-of-Experts models rely on learned routers to assign tokens to experts, yet standard softmax gating provides no principled mechanism to control the tradeoff between sparsity and utilization. We propose Grassmannian MoE (GrMoE), a routing framework that operates on the Grassmannian manifold of subspaces, where gating weights arise from the concentration parameters of Matrix Bingham distributions. This construction yields a single, interpretable knob -- the concentration matrix $\Lambda$ -- that continuously controls routing entropy, replacing discrete top-$k$ selection with a smooth, geometrically principled sparsity mechanism. We further develop an amortized variational inference procedure for posterior routing distributions, enabling uncertainty-aware expert assignment that naturally resists expert collapse. We formally prove tight bounds relating the Bingham concentration spectrum to routing entropy, expected top-$k$ mass, and an exponential bound on expert collapse, establishing the first formal theory of concentration-controlled sparsit...
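The abstract's claim that a scalar tightening of the concentration controls expected top-$k$ mass suggests a concrete picture of post-hoc sparsity tuning. The sketch below is a hypothetical illustration, not the paper's procedure: it assumes a scalar concentration `kappa` (the paper's $\Lambda$ is a full matrix), fixed random subspace bases standing in for trained parameters, and a made-up helper `tune_kappa`. Because the top-1 routing mass increases monotonically with the concentration scale, a simple bisection can hit a target sparsity level at inference time, with no retraining of the bases.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, n_experts = 16, 4, 8
# Fixed "learned" subspace bases; only the concentration knob changes below.
bases = [np.linalg.qr(rng.normal(size=(d, r)))[0] for _ in range(n_experts)]

def routing_weights(x, kappa):
    """Softmax over Bingham-style scores kappa * ||U_e^T x||^2."""
    x = x / np.linalg.norm(x)
    scores = np.array([kappa * np.sum((U.T @ x) ** 2) for U in bases])
    w = np.exp(scores - scores.max())
    return w / w.sum()

def tune_kappa(x, target_top1, lo=0.0, hi=1e4, iters=60):
    """Bisect on the scalar concentration until the largest routing weight
    reaches target_top1 (hypothetical post-hoc tuning; the subspaces are
    never re-trained, only the knob moves)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if routing_weights(x, mid).max() < target_top1:
            lo = mid
        else:
            hi = mid
    return hi

x = rng.normal(size=d)
kappa = tune_kappa(x, target_top1=0.9)
print(f"kappa={kappa:.1f}  top-1 mass={routing_weights(x, kappa).max():.3f}")
```

The design point this illustrates: with top-$k$ gating, sparsity is a discrete training-time choice; with a concentration knob, it becomes a continuous quantity one can solve for after training.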