[2509.17874] Deep Hierarchical Learning with Nested Subspace Networks for Large Language Models
Computer Science > Machine Learning

arXiv:2509.17874 (cs)

[Submitted on 22 Sep 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: Deep Hierarchical Learning with Nested Subspace Networks for Large Language Models

Authors: Paulius Rauba, Mihaela van der Schaar

Abstract: Large neural networks are typically trained for a fixed computational budget, creating a rigid trade-off between performance and efficiency that is ill-suited for deployment in resource-constrained or dynamic environments. Existing approaches to this problem present a difficult choice: training a discrete collection of specialist models is computationally prohibitive, while dynamic methods like slimmable networks often lack the flexibility to be applied to large, pre-trained foundation models. In this work, we propose Nested Subspace Networks (NSNs), a novel architectural paradigm that enables a single model to be dynamically and granularly adjusted across a continuous spectrum of compute budgets at inference time. The core of our approach is to re-parameterize linear layers to satisfy a nested subspace property, such that the function computed at a given rank is a strict subspace of the function at any higher rank. We show that this entire hierarchy of models can be optimized jointly via an uncertainty-aware objective that...
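The nested subspace property lends itself to a compact illustration. The following is a minimal PyTorch sketch, assuming the linear layer is factorized as W = U diag(s) V^T and evaluated by truncating to the first r rank-one components; the class name NestedSubspaceLinear, the parameters U, s, V, and the max_rank/rank arguments are illustrative assumptions, not the paper's implementation. The joint-training loop further assumes a learned-uncertainty weighting across ranks in the style of Kendall and Gal (2018), since the truncated abstract does not spell out the objective.

```python
import torch
import torch.nn as nn

class NestedSubspaceLinear(nn.Module):
    """Sketch of a nested-subspace linear layer: W = U diag(s) V^T.

    Evaluating with the first r components reuses exactly the parameters
    of every lower rank, so the rank-r function is nested inside the
    function at any higher rank (the property described in the abstract).
    """

    def __init__(self, in_features: int, out_features: int, max_rank: int):
        super().__init__()
        self.max_rank = max_rank
        self.U = nn.Parameter(torch.randn(out_features, max_rank) / max_rank ** 0.5)
        self.s = nn.Parameter(torch.ones(max_rank))
        self.V = nn.Parameter(torch.randn(in_features, max_rank) / in_features ** 0.5)

    def forward(self, x: torch.Tensor, rank: int | None = None) -> torch.Tensor:
        r = self.max_rank if rank is None else min(rank, self.max_rank)
        z = x @ self.V[:, :r]        # project into the leading r-dim subspace
        z = z * self.s[:r]           # per-component scaling
        return z @ self.U[:, :r].T   # map back to the output space


# Hypothetical joint optimization over several ranks, weighting each
# rank's loss by a learned uncertainty term; this weighting scheme is an
# assumption, not a detail confirmed by the truncated abstract.
ranks = (16, 32, 64)
layer = NestedSubspaceLinear(512, 512, max_rank=64)
log_sigma = nn.Parameter(torch.zeros(len(ranks)))  # one log-std per rank
opt = torch.optim.Adam(list(layer.parameters()) + [log_sigma], lr=1e-3)

x, target = torch.randn(8, 512), torch.randn(8, 512)
for _ in range(100):
    opt.zero_grad()
    loss = torch.tensor(0.0)
    for i, r in enumerate(ranks):
        mse = nn.functional.mse_loss(layer(x, rank=r), target)
        loss = loss + mse / (2 * log_sigma[i].exp() ** 2) + log_sigma[i]
    loss.backward()
    opt.step()
```

Under these assumptions, the same trained weights can be queried at any rank r up to max_rank at inference time, trading accuracy for compute without retraining or storing a separate specialist model per budget.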