[2601.18936] Bi-Level Online Provisioning and Scheduling with Switching Costs and Cross-Level Constraints
Summary
This paper explores a bi-level online provisioning and scheduling problem, focusing on network resource allocation with varying constraints and switching costs, proposing a novel algorithm for optimal decision-making.
Why It Matters
The study addresses critical gaps in existing online convex optimization and constrained Markov decision processes by incorporating dynamic constraints and switching costs, which are essential for improving efficiency in network resource management. This research has implications for various applications in machine learning and operations research.
Key Takeaways
- Introduces a bi-level online provisioning and scheduling framework.
- Addresses limitations of existing online convex optimization methods.
- Proposes a novel algorithm that incorporates switching costs.
- Establishes near-optimal regret and satisfaction of constraints.
- Highlights the importance of adaptive budget management in scheduling.
Computer Science > Machine Learning arXiv:2601.18936 (cs) [Submitted on 26 Jan 2026 (v1), last revised 20 Feb 2026 (this version, v2)] Title:Bi-Level Online Provisioning and Scheduling with Switching Costs and Cross-Level Constraints Authors:Jialei Liu, C. Emre Koksal, Ming Shi View a PDF of the paper titled Bi-Level Online Provisioning and Scheduling with Switching Costs and Cross-Level Constraints, by Jialei Liu and 2 other authors View PDF HTML (experimental) Abstract:We study a bi-level online provisioning and scheduling problem motivated by network resource allocation, where provisioning decisions are made at a slow time scale while queue-/state-dependent scheduling is performed at a fast time scale. We model this two-time-scale interaction using an upper-level online convex optimization (OCO) problem and a lower-level constrained Markov decision process (CMDP). Existing OCO typically assumes stateless decisions and thus cannot capture MDP network dynamics such as queue evolution. Meanwhile, CMDP algorithms typically assume a fixed constraint threshold, whereas in provisioning-and-scheduling systems, the threshold varies with online budget decisions. To address these gaps, we study bi-level OCO-CMDP learning under switching costs (budget reprovisioning/system reconfiguration) and cross-level constraints that couple budgets to scheduling decisions. Our new algorithm solves this learning problem via several non-trivial developments, including a carefully designed dual f...