[2601.05383] Imitation Learning for Combinatorial Optimisation under Uncertainty

[2601.05383] Imitation Learning for Combinatorial Optimisation under Uncertainty

arXiv - Machine Learning 4 min read Article

Summary

This paper explores imitation learning for combinatorial optimization under uncertainty, introducing a taxonomy of expert types and a new algorithm to improve learning outcomes in complex decision-making scenarios.

Why It Matters

The research addresses a critical gap in the application of imitation learning to combinatorial optimization problems, particularly under uncertainty. By providing a systematic framework for expert classification and a novel algorithm, it enhances the understanding and effectiveness of learning policies in real-world applications, such as healthcare and logistics.

Key Takeaways

  • Introduces a taxonomy of experts for imitation learning in optimization.
  • Highlights the impact of expert types on learning performance.
  • Proposes a generalized DAgger algorithm for improved interaction with experts.
  • Demonstrates that stochastic experts yield better policies than deterministic ones.
  • Interactive learning can enhance solution quality with fewer demonstrations.

Computer Science > Machine Learning arXiv:2601.05383 (cs) [Submitted on 8 Jan 2026 (v1), last revised 12 Feb 2026 (this version, v2)] Title:Imitation Learning for Combinatorial Optimisation under Uncertainty Authors:Prakash Gawas, Antoine Legrain, Louis-Martin Rousseau View a PDF of the paper titled Imitation Learning for Combinatorial Optimisation under Uncertainty, by Prakash Gawas and 1 other authors View PDF HTML (experimental) Abstract:Imitation learning (IL) provides a data-driven framework for approximating policies for large-scale combinatorial optimisation problems formulated as sequential decision problems (SDPs), where exact solution methods are computationally intractable. A central but underexplored aspect of IL in this context is the role of the \emph{expert} that generates training demonstrations. Existing studies employ a wide range of expert constructions, yet lack a unifying framework to characterise their modelling assumptions, computational properties, and impact on learning performance. This paper introduces a systematic taxonomy of experts for IL in combinatorial optimisation under uncertainty. Experts are classified along three dimensions: (i) their treatment of uncertainty, including myopic, deterministic, full-information, two-stage stochastic, and multi-stage stochastic formulations; (ii) their level of optimality, distinguishing task-optimal and approximate experts; and (iii) their interaction mode with the learner, ranging from one-shot supervis...

Related Articles

Machine Learning

Fed Chair Jerome Powell, Treasury's Bessent and top bank CEOs met over Anthropic's Mythos model

Reddit - Artificial Intelligence · 1 min ·
CoreWeave strikes a deal to power Anthropic's Claude AI models — and the stock surges 12%
Llms

CoreWeave strikes a deal to power Anthropic's Claude AI models — and the stock surges 12%

AI Tools & Products · 3 min ·
New AI model sparks alarm as governments brace for AI-driven cyberattacks
Machine Learning

New AI model sparks alarm as governments brace for AI-driven cyberattacks

AI Tools & Products · 6 min ·
Machine Learning

Anthropic Model Scare Sparks Urgent Bessent, Powell Warning to Bank CEOs

Please make sure your browser supports JavaScript and cookies and that you are not blocking them from loading. ...

AI Tools & Products · 1 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime