[2602.16923] Poisson-MNL Bandit: Nearly Optimal Dynamic Joint Assortment and Pricing with Decision-Dependent Customer Arrivals
Summary
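The paper proposes the Poisson-MNL model for dynamic joint assortment and pricing, in which the number of customers arriving in a period depends on the offered assortment and prices, and develops a UCB-based algorithm (PMNL) to maximize cumulative revenue over a horizon $T$.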
Why It Matters
Classical multinomial logit (MNL) models treat customer arrivals as fixed, which can lead to suboptimal decisions when assortment and prices affect not only what an arriving customer buys but also how many customers arrive. Modeling that dependence lets a seller account for both effects when choosing assortments and prices.
Key Takeaways
- The Poisson-MNL model couples a contextual MNL choice model with a Poisson arrival model whose rate depends on the offered assortment and prices.
- An efficient algorithm (PMNL), based on the upper confidence bound (UCB) idea, is developed and shown to be nearly optimal.
- Simulation results show PMNL outperforms models assuming fixed customer arrivals.
- The study proves a non-asymptotic regret bound of order $\sqrt{T\log{T}}$ and a matching lower bound (up to $\log T$).
- This approach can be applied across various sectors to improve revenue management.
Statistics > Machine Learning
arXiv:2602.16923 (stat)
[Submitted on 18 Feb 2026]
Title: Poisson-MNL Bandit: Nearly Optimal Dynamic Joint Assortment and Pricing with Decision-Dependent Customer Arrivals
Authors: Junhui Cai, Ran Chen, Qitao Huang, Linda Zhao, Wu Zhu
Abstract: We study dynamic joint assortment and pricing where a seller updates decisions at regular accounting/operating intervals to maximize the cumulative per-period revenue over a horizon $T$. In many settings, assortment and prices affect not only what an arriving customer buys but also how many customers arrive within the period, whereas classical multinomial logit (MNL) models assume arrivals as fixed, potentially leading to suboptimal decisions. We propose a Poisson-MNL model that couples a contextual MNL choice model with a Poisson arrival model whose rate depends on the offered assortment and prices. Building on this model, we develop an efficient algorithm PMNL based on the idea of upper confidence bound (UCB). We establish its (near) optimality by proving a non-asymptotic regret bound of order $\sqrt{T\log{T}}$ and a matching lower bound (up to $\log T$). Simulation studies underscore the importance of accounting for the dependency of arrival rates on assortment and pricing: PMNL effectively learns customer choice and ...
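To illustrate why decision-dependent arrivals matter, here is a minimal Python sketch of per-period expected revenue under a Poisson arrival rate combined with MNL choice. It is not the paper's PMNL algorithm and involves no learning: all parameters are treated as known. The log-linear arrival rate, the function names (`mnl_purchase_probs`, `arrival_rate`, `expected_revenue`, `best_single_price`), and every numeric value are assumptions made for illustration only; the abstract states only that the rate depends on the offered assortment and prices.

```python
import math


def mnl_purchase_probs(utilities, prices, price_sens):
    """MNL purchase probabilities; the no-purchase option has weight 1."""
    weights = [math.exp(u - price_sens * p) for u, p in zip(utilities, prices)]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights]


def arrival_rate(prices, base_rate, arrival_sens):
    """Assumed (hypothetical) log-linear Poisson rate that falls as the mean price rises."""
    mean_price = sum(prices) / len(prices)
    return base_rate * math.exp(-arrival_sens * mean_price)


def expected_revenue(utilities, prices, price_sens, base_rate, arrival_sens):
    """Expected per-period revenue: mean arrivals times expected revenue per arrival."""
    probs = mnl_purchase_probs(utilities, prices, price_sens)
    per_customer = sum(p * q for p, q in zip(prices, probs))
    return arrival_rate(prices, base_rate, arrival_sens) * per_customer


def best_single_price(utility, price_sens, base_rate, arrival_sens):
    """Grid search over prices 0.01..5.00 for a one-product assortment."""
    grid = [i / 100 for i in range(1, 501)]
    return max(
        grid,
        key=lambda p: expected_revenue(
            [utility], [p], price_sens, base_rate, arrival_sens
        ),
    )


# Same choice model, without and with price-dependent arrivals (illustrative values).
price_fixed = best_single_price(1.0, 1.0, 100.0, arrival_sens=0.0)
price_dependent = best_single_price(1.0, 1.0, 100.0, arrival_sens=0.5)
print(price_fixed, price_dependent)
```

Under these assumed parameters, the revenue-maximizing price is lower once arrivals respond to price, which is the kind of gap a fixed-arrival MNL model would miss.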