[2509.16650] Safe and Near-Optimal Control with Online Dynamics Learning
Summary
This paper presents a novel approach to safe and near-optimal control in dynamic environments, using online dynamics learning to ensure safety while pursuing near-optimal performance.
Why It Matters
The research addresses critical challenges in deploying autonomous agents in real-world scenarios where safety and optimality are paramount. By introducing a framework that balances exploration and safety, it contributes to advancements in robotics and control systems, potentially impacting industries like autonomous driving and drone navigation.
Key Takeaways
- Introduces maximum safe dynamics learning for optimal control.
- Ensures safety during online learning without requiring resets.
- Demonstrates effectiveness in complex scenarios like autonomous car racing.
- Operates in a non-episodic setting, differing from traditional reinforcement learning.
- Achieves close-to-optimal performance while learning the dynamics only to the extent needed.
Electrical Engineering and Systems Science > Systems and Control
arXiv:2509.16650 (eess)
Submitted on 20 Sep 2025 (v1); last revised 21 Feb 2026 (this version, v2)
Title: Safe and Near-Optimal Control with Online Dynamics Learning
Authors: Manish Prajapat, Johannes Köhler, Melanie N. Zeilinger, Andreas Krause
Abstract: Achieving both optimality and safety under unknown system dynamics is a central challenge in the real-world deployment of agents. To address this, we introduce a notion of maximum safe dynamics learning, where sufficient exploration is performed within the space of safe policies. Our method executes *pessimistically* safe policies while *optimistically* exploring informative states and, despite not reaching them due to model uncertainty, ensures continuous online learning of dynamics. The framework achieves first-of-its-kind results: learning the dynamics model sufficiently, up to an arbitrarily small tolerance (subject to noise), in finite time, while ensuring provably safe operation throughout with high probability and without requiring resets. Building on this, we propose an algorithm to maximize rewards while learning the dynamics *only to the extent needed* to achieve close-to-optimal performance. Unlike typical reinforcement learning (RL) methods, our approach operates online in a non-ep...
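To make the "pessimistic safety, optimistic exploration" idea concrete, here is a minimal toy sketch, not the paper's algorithm: per-state confidence intervals stand in for a learned dynamics model, the pessimistic safe set contains only states whose upper confidence bound satisfies the constraint under every plausible model, and exploration visits the most uncertain safe state. The dynamics `true_f`, the constraint, and the crude neighbour-shrinking update (a stand-in for a proper Gaussian-process posterior) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D setting: unknown dynamics f(x) with per-state confidence
# intervals [mu - beta*sigma, mu + beta*sigma] maintained from data.
states = np.linspace(0.0, 1.0, 21)
mu = np.zeros_like(states)           # posterior mean of f at each state
sigma = np.full_like(states, 1.0)    # posterior std (shrinks with data)
sigma[:3] = 0.05                     # assumed known-safe seed region
beta, safety_limit = 2.0, 0.5        # constraint: f(x) <= safety_limit

def true_f(x):
    return 0.4 * np.sin(3.0 * x)     # hypothetical ground-truth dynamics

for _ in range(50):
    # Pessimistic safe set: the upper confidence bound must satisfy the
    # constraint, i.e. safety holds under every plausible model.
    safe = mu + beta * sigma <= safety_limit
    # Optimistic exploration: among pessimistically safe states, visit
    # the most uncertain (most informative) one.
    idx = np.flatnonzero(safe)[np.argmax(sigma[safe])]
    y = true_f(states[idx]) + 0.01 * rng.normal()
    # Crude stand-in for a GP update: shrink the visited state's
    # interval and, via correlation, its neighbours' intervals too.
    mu[idx], sigma[idx] = 0.5 * (mu[idx] + y), 0.5 * sigma[idx]
    for j in (idx - 1, idx + 1):
        if 0 <= j < len(states):
            mu[j] = 0.7 * mu[j] + 0.3 * y
            sigma[j] *= 0.8

safe = mu + beta * sigma <= safety_limit
print(int(safe.sum()))  # pessimistically safe states after exploration
```

The safe set starts as the seed region and expands as uncertainty shrinks near its frontier, which mirrors the abstract's point that informative states can be targeted optimistically while every executed policy remains pessimistically safe.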