[2602.12631] AI Agents for Inventory Control: Human-LLM-OR Complementarity
Summary
This paper explores the integration of AI agents, particularly large language models (LLMs), with traditional operations research (OR) methods for inventory control, demonstrating their complementary roles in decision-making processes.
Why It Matters
As businesses increasingly rely on AI for operational efficiency, understanding how to effectively combine human expertise, LLMs, and OR algorithms can lead to improved inventory management and profitability. This research highlights the potential for enhanced decision-making in dynamic environments.
Key Takeaways
- Combining OR algorithms with LLMs enhances inventory control performance.
- Human-AI collaboration can yield higher profits than humans or AI agents working in isolation.
- The study introduces InventoryBench, a benchmark for testing inventory decision rules.
- Contrary to previous findings, human-AI teams can outperform individual methods.
- A significant fraction of individuals benefit from AI collaboration in decision-making.
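To make "testing inventory decision rules" concrete, the following is a minimal sketch of how one ordering rule might be scored on a single multi-period instance. Everything in it is an assumption made for illustration: the function names `simulate` and `base_stock`, the lost-sales dynamics, the fixed lead time, and the price, cost, and holding parameters. None of it is taken from InventoryBench or from the paper's methods, which this summary does not describe in detail.

```python
from collections import deque
from typing import Callable, List, Sequence

# Hypothetical signature for a decision rule: it sees on-hand stock and the
# orders still in transit, and returns a non-negative order quantity.
OrderRule = Callable[[int, List[int]], int]


def base_stock(level: int) -> OrderRule:
    """Classic order-up-to rule: raise the inventory position to `level`."""
    def rule(on_hand: int, in_transit: List[int]) -> int:
        position = on_hand + sum(in_transit)
        return max(0, level - position)
    return rule


def simulate(
    demands: Sequence[int],
    rule: OrderRule,
    lead_time: int = 1,    # assumed fixed lead time, must be >= 1
    price: float = 5.0,    # assumed revenue per unit sold
    cost: float = 3.0,     # assumed purchase cost per unit ordered
    holding: float = 0.5,  # assumed holding cost per leftover unit per period
) -> float:
    """Return total profit of `rule` on one demand path (lost-sales model)."""
    if lead_time < 1:
        raise ValueError("this sketch assumes lead_time >= 1")
    on_hand = 0
    pipeline = deque([0] * lead_time)  # oldest outstanding order first
    profit = 0.0
    for demand in demands:
        on_hand += pipeline.popleft()        # oldest order arrives
        order = rule(on_hand, list(pipeline))
        pipeline.append(order)               # new order enters the pipeline
        profit -= cost * order
        sales = min(on_hand, demand)         # unmet demand is lost
        on_hand -= sales
        profit += price * sales
        profit -= holding * on_hand          # pay to carry leftover stock
    return profit


print(simulate([4, 4, 4], base_stock(8)))  # 2.0
```

Under these assumptions, a benchmark-style comparison would run several rules over many demand paths and compare their total profit. Any alternative rule, including one that adjusts an OR recommendation using extra context, can be plugged in as long as it matches the hypothetical `OrderRule` signature.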
Computer Science > Artificial Intelligence
arXiv:2602.12631 (cs)
[Submitted on 13 Feb 2026]
Title: AI Agents for Inventory Control: Human-LLM-OR Complementarity
Authors: Jackie Baek, Yaopeng Fu, Will Ma, Tianyi Peng
Abstract: Inventory control is a fundamental operations problem in which ordering decisions are traditionally guided by theoretically grounded operations research (OR) algorithms. However, such algorithms often rely on rigid modeling assumptions and can perform poorly when demand distributions shift or relevant contextual information is unavailable. Recent advances in large language models (LLMs) have generated interest in AI agents that can reason flexibly and incorporate rich contextual signals, but it remains unclear how best to incorporate LLM-based methods into traditional decision-making pipelines. We study how OR algorithms, LLMs, and humans can interact and complement each other in a multi-period inventory control setting. We construct InventoryBench, a benchmark of over 1,000 inventory instances spanning both synthetic and real-world demand data, designed to stress-test decision rules under demand shifts, seasonality, and uncertain lead times. Through this benchmark, we find that OR-augmented LLM methods outperform either method in isolation, suggesting that these methods are complementary rather tha...