[2210.10278] A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design
Computer Science > Machine Learning
arXiv:2210.10278 (cs)
[Submitted on 19 Oct 2022 (v1), last revised 2 Mar 2026 (this version, v2)]
Title: A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design
Authors: Rui Ai, Boxiang Lyu, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan
Abstract: We study reserve price optimization in multi-phase second-price auctions, where the seller's prior actions affect the bidders' later valuations through a Markov Decision Process (MDP). Compared to the bandit setting in existing work, our setting involves three challenges. First, from the seller's perspective, we need to explore the environment efficiently in the presence of potentially untruthful bidders who aim to manipulate the seller's policy. Second, we want to minimize the seller's revenue regret when the market noise distribution is unknown. Third, the seller's per-step revenue is an unknown, nonlinear random variable that cannot be observed directly from the environment; only its realized values are observed. We propose a mechanism addressing all three challenges. To address the first challenge, we use a combination of a new technique named "buffer periods" and inspirations from Reinforcement Learning (RL) with low switching cost to limit bidders' surplus from untruthful bidding, thereby incentivizing approxima...
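The revenue object the abstract refers to, the seller's per-step take in a second-price auction with a reserve price, can be sketched as follows. This is a minimal illustration of the standard auction format only, not the paper's multi-phase mechanism; the function name and structure are illustrative assumptions:

```python
def second_price_revenue(bids, reserve):
    """Revenue of one second-price auction with a reserve price.

    The item sells only if the highest bid meets the reserve; the
    winner then pays the larger of the second-highest bid and the
    reserve. Returns 0.0 when there is no sale.
    (Sketch of the standard format, not the paper's mechanism.)
    """
    if not bids:
        return 0.0
    ordered = sorted(bids, reverse=True)
    if ordered[0] < reserve:
        return 0.0  # highest bid below reserve: no sale
    runner_up = ordered[1] if len(ordered) > 1 else reserve
    return float(max(runner_up, reserve))
```

Raising the reserve trades off sale probability against payment, which is why the realized revenue is a nonlinear random function of the reserve, as the abstract notes.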