[2601.05383] Imitation Learning for Combinatorial Optimisation under Uncertainty
Summary
This paper explores imitation learning for combinatorial optimization under uncertainty, introducing a taxonomy of expert types and a new algorithm to improve learning outcomes in complex decision-making scenarios.
Why It Matters
The research addresses a critical gap in the application of imitation learning to combinatorial optimization problems, particularly under uncertainty. By providing a systematic framework for expert classification and a novel algorithm, it enhances the understanding and effectiveness of learning policies in real-world applications, such as healthcare and logistics.
Key Takeaways
- Introduces a taxonomy of experts for imitation learning in optimization.
- Highlights the impact of expert types on learning performance.
- Proposes a generalized DAgger algorithm for improved interaction with experts.
- Demonstrates that stochastic experts yield better policies than deterministic ones.
- Interactive learning can enhance solution quality with fewer demonstrations.
Computer Science > Machine Learning arXiv:2601.05383 (cs) [Submitted on 8 Jan 2026 (v1), last revised 12 Feb 2026 (this version, v2)] Title:Imitation Learning for Combinatorial Optimisation under Uncertainty Authors:Prakash Gawas, Antoine Legrain, Louis-Martin Rousseau View a PDF of the paper titled Imitation Learning for Combinatorial Optimisation under Uncertainty, by Prakash Gawas and 1 other authors View PDF HTML (experimental) Abstract:Imitation learning (IL) provides a data-driven framework for approximating policies for large-scale combinatorial optimisation problems formulated as sequential decision problems (SDPs), where exact solution methods are computationally intractable. A central but underexplored aspect of IL in this context is the role of the \emph{expert} that generates training demonstrations. Existing studies employ a wide range of expert constructions, yet lack a unifying framework to characterise their modelling assumptions, computational properties, and impact on learning performance. This paper introduces a systematic taxonomy of experts for IL in combinatorial optimisation under uncertainty. Experts are classified along three dimensions: (i) their treatment of uncertainty, including myopic, deterministic, full-information, two-stage stochastic, and multi-stage stochastic formulations; (ii) their level of optimality, distinguishing task-optimal and approximate experts; and (iii) their interaction mode with the learner, ranging from one-shot supervis...