[2501.16613] Safe Reinforcement Learning for Real-World Engine Control
Summary
This article presents a toolchain for implementing safe reinforcement learning in real-world engine control, demonstrated on transient load control of an internal combustion engine using the Deep Deterministic Policy Gradient (DDPG) algorithm.
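DDPG is an off-policy actor-critic method for continuous action spaces: a deterministic actor is improved by following the critic's action gradient, while slowly-updated target networks stabilize the temporal-difference targets. A minimal sketch with linear function approximators (all dimensions, rates, and the toy reward are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
S, A = 4, 1                    # state/action dimensions (illustrative)
gamma, tau, lr = 0.99, 0.005, 1e-2

# Linear actor mu(s) = Wa @ s and critic Q(s, a) = wq @ [s; a]
Wa = rng.normal(scale=0.1, size=(A, S))
wq = rng.normal(scale=0.1, size=S + A)
Wa_t, wq_t = Wa.copy(), wq.copy()      # target networks

def mu(W, s):                  # deterministic policy
    return W @ s

def q(w, s, a):                # critic value
    return w @ np.concatenate([s, a])

for step in range(200):
    s = rng.normal(size=S)
    a = mu(Wa, s) + 0.1 * rng.normal(size=A)   # exploration noise
    r = -float(a @ a)                          # toy reward
    s2 = rng.normal(size=S)

    # Critic: TD target built from the *target* actor and critic
    y = r + gamma * q(wq_t, s2, mu(Wa_t, s2))
    td = q(wq, s, a) - y
    wq -= lr * td * np.concatenate([s, a])     # grad of squared TD error

    # Actor: deterministic policy gradient dQ/da * dmu/dWa
    dq_da = wq[S:]                             # linear critic: constant grad
    Wa += lr * np.outer(dq_da, s)

    # Polyak averaging of target parameters
    Wa_t = (1 - tau) * Wa_t + tau * Wa
    wq_t = (1 - tau) * wq_t + tau * wq
```

In the full method, the linear maps above are deep networks and transitions are drawn from a replay buffer; the structure of the updates is the same.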
Why It Matters
The research addresses critical safety concerns in applying reinforcement learning to engine control, a field where traditional methods often fail due to complex dynamics. By ensuring safety through real-time monitoring, this work opens avenues for more efficient and environmentally friendly engine management, promoting the use of renewable fuels.
Key Takeaways
- Introduces a toolchain for safe reinforcement learning in engine control.
- Demonstrates application on a single-cylinder internal combustion engine in HCCI mode.
- Implements real-time safety monitoring to prevent engine damage.
- Achieves competitive performance with a root mean square error of 0.1374 bar.
- Promotes renewable fuel use through adaptable control policies.
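The reported tracking accuracy is a root mean square error over the achieved versus the reference load trace; the metric itself is straightforward to compute (the traces below are toy values, not the paper's data):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between a reference and an achieved trace."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative load traces in bar (toy data)
ref = [3.0, 3.5, 4.0, 4.0, 3.5]
act = [3.1, 3.4, 4.1, 3.9, 3.6]
print(rmse(ref, act))   # ≈ 0.1 bar for this toy trace
```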
Computer Science > Machine Learning
arXiv:2501.16613 (cs)
[Submitted on 28 Jan 2025 (v1), last revised 24 Feb 2026 (this version, v2)]
Title: Safe Reinforcement Learning for Real-World Engine Control
Authors: Julian Bedei, Lucas Koch, Kevin Badalian, Alexander Winkler, Patrick Schaber, Jakob Andert
Abstract: This work introduces a toolchain for applying Reinforcement Learning (RL), specifically the Deep Deterministic Policy Gradient (DDPG) algorithm, in safety-critical real-world environments. As an exemplary application, transient load control is demonstrated on a single-cylinder internal combustion engine testbench in Homogeneous Charge Compression Ignition (HCCI) mode, which offers high thermal efficiency and low emissions. However, HCCI poses challenges for traditional control methods due to its nonlinear, autoregressive, and stochastic nature. RL provides a viable solution; however, safety concerns, such as excessive pressure rise rates, must be addressed when applying it to HCCI. A single unsuitable control input can severely damage the engine or cause misfiring and shutdown. Additionally, operating limits are not known a priori and must be determined experimentally. To mitigate these risks, real-time safety monitoring based on the k-nearest neighbor algorithm is imple...
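The safety layer described in the abstract rejects control inputs that lie too far from experimentally verified safe operating points, using k-nearest-neighbor distances. A minimal sketch of that idea (class name, threshold, and data are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

class KnnSafetyMonitor:
    """Flag candidate (state, action) points that are far from the cloud of
    experimentally verified safe operating points (illustrative sketch)."""

    def __init__(self, safe_points, k=5, max_dist=0.5):
        self.safe = np.asarray(safe_points, float)
        self.k = k
        self.max_dist = max_dist   # distance threshold (assumed tuning knob)

    def is_safe(self, candidate):
        d = np.linalg.norm(self.safe - np.asarray(candidate, float), axis=1)
        # Safe if the mean distance to the k nearest verified points is small
        return float(np.sort(d)[: self.k].mean()) <= self.max_dist

# Toy safe region: a point cloud around the origin
rng = np.random.default_rng(1)
monitor = KnnSafetyMonitor(rng.normal(scale=0.2, size=(200, 2)))
print(monitor.is_safe([0.0, 0.0]))   # inside the verified region -> True
print(monitor.is_safe([5.0, 5.0]))   # far outside               -> False
```

An unsafe candidate would then be clipped or replaced by a fallback input before reaching the engine, so a single bad action cannot cause excessive pressure rise or misfiring.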