[2604.04986] Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model
Computer Science > Machine Learning

arXiv:2604.04986 (cs)

[Submitted on 5 Apr 2026]

Title: Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model

Authors: Zesheng Yao, Zhen-Hua Wan, Canjun Yang, Qingchao Xia, Mengqi Zhang

Abstract: Model-free deep reinforcement learning (DRL) methods suffer from poor sample efficiency. To overcome this limitation, this work introduces an adaptive reduced-order-model (ROM)-based reinforcement learning framework for active flow control. In contrast to conventional actor-critic architectures, the proposed approach uses the ROM to estimate the gradient information required for controller optimization. The ROM structure incorporates physical insight: it couples a linear dynamical system with a neural ordinary differential equation (NODE) that estimates the nonlinearity in the flow. The parameters of the linear component are identified via operator inference, while the NODE is trained in a data-driven manner using gradient-based optimization. During controller-environment interactions, the ROM is continuously updated with newly collected data, enabling adaptive refinement of the model. The controller is then optimized thro...
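The abstract pins down the ROM's form: a linear dynamical system identified by operator inference, plus a NODE correction for the flow's nonlinearity, refined online as new data arrive. A minimal PyTorch sketch of such a hybrid ROM is given below. It illustrates the stated structure under assumptions, not the authors' implementation: every name in it (HybridROM, fit_linear_operator_inference, update_node) is hypothetical, as are the explicit-Euler integrator, the assumed dynamics da/dt = A a + B u + f_theta(a, u), and the network sizes.

```python
# Hypothetical sketch of the abstract's hybrid ROM: linear dynamics + NODE nonlinearity.
# Assumed form: da/dt = A a + B u + f_theta(a, u), with (A, B) fit by least-squares
# operator inference and f_theta trained by gradient descent on the residual.
import torch
import torch.nn as nn

class HybridROM(nn.Module):
    def __init__(self, state_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        # Linear operators: identified from data by least squares, not by backprop.
        self.A = nn.Parameter(torch.zeros(state_dim, state_dim), requires_grad=False)
        self.B = nn.Parameter(torch.zeros(state_dim, act_dim), requires_grad=False)
        # NODE right-hand side modeling the unresolved nonlinearity in the flow.
        self.f = nn.Sequential(
            nn.Linear(state_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def rhs(self, a, u):
        # da/dt = A a + B u + f_theta(a, u)
        return a @ self.A.T + u @ self.B.T + self.f(torch.cat([a, u], dim=-1))

    def step(self, a, u, dt: float):
        # One explicit-Euler step; the paper's time integrator is unspecified.
        return a + dt * self.rhs(a, u)

def fit_linear_operator_inference(rom, a, u, dadt):
    # Operator inference: least-squares fit of [A B] to observed time derivatives,
    #   min_{A,B} || dadt - a A^T - u B^T ||^2   over a trajectory of T snapshots.
    Z = torch.cat([a, u], dim=-1)              # (T, n+m) regressor matrix
    AB = torch.linalg.lstsq(Z, dadt).solution  # (n+m, n) stacked [A^T; B^T]
    n = a.shape[-1]
    rom.A.copy_(AB[:n].T)
    rom.B.copy_(AB[n:].T)

def update_node(rom, a, u, dadt, steps: int = 200, lr: float = 1e-3):
    # Data-driven refinement: train f_theta on the residual left by the linear model,
    # re-run after each batch of controller-environment interactions to adapt the ROM.
    opt = torch.optim.Adam(rom.f.parameters(), lr=lr)
    for _ in range(steps):
        loss = ((rom.rhs(a, u) - dadt) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Because rom.step is differentiable, the controller gradient that the abstract says the ROM supplies would presumably come from backpropagating a reward estimate through short ROM rollouts; the truncated abstract does not specify the rollout horizon or the reward estimator, so that step is left open here.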