[2603.04247] Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback

[2603.04247] Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2603.04247: Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback

Computer Science > Machine Learning arXiv:2603.04247 (cs) [Submitted on 4 Mar 2026] Title:Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback Authors:Haoran Zhang, Seohyeon Cha, Hasan Burhan Beytur, Kevin S Chan, Gustavo de Veciana, Haris Vikalo View a PDF of the paper titled Online Learning for Multi-Layer Hierarchical Inference under Partial and Policy-Dependent Feedback, by Haoran Zhang and 5 other authors View PDF HTML (experimental) Abstract:Hierarchical inference systems route tasks across multiple computational layers, where each node may either finalize a prediction locally or offload the task to a node in the next layer for further processing. Learning optimal routing policies in such systems is challenging: inference loss is defined recursively across layers, while feedback on prediction error is revealed only at a terminal oracle layer. This induces a partial, policy-dependent feedback structure in which observability probabilities decay with depth, causing importance-weighted estimators to suffer from amplified variance. We study online routing for multi-layer hierarchical inference under long-term resource constraints and terminal-only feedback. We formalize the recursive loss structure and show that naive importance-weighted contextual bandit methods become unstable as feedback probability decays along the hierarchy. To address this, we develop a variance-reduced EXP4-based algorithm integrated with Lyapunov opti...

Originally published on March 05, 2026. Curated by AI News.

Related Articles

Machine Learning

[R] Are there ML approaches for prioritizing and routing “important” signals across complex systems?

I’ve been reading more about attention mechanisms in transformers and how they effectively learn to weight and prioritize relevant inputs...

Reddit - Machine Learning · 1 min ·
Llms

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min ·
Machine Learning

[R] Structure Over Scale: Memory-First Reasoning and Depth-Pruned Efficiency in Magnus and Seed Architecture Auto-Discovery

Dataset Model Acc F1 Δ vs Log Δ vs Static Avg Params Peak Params Steps Infer ms Size Banking77-20 Logistic TF-IDF 92.37% 0.9230 +0.00pp +...

Reddit - Machine Learning · 1 min ·
UM Computer Scientists Land Grant to Improve Models of Melting Greenland Glaciers
Machine Learning

UM Computer Scientists Land Grant to Improve Models of Melting Greenland Glaciers

Two UM researchers are using advanced neural networks, machine learning and artificial intelligence to improve climate models to better p...

AI News - General · 5 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime