[2511.02605] Adaptive GR(1) Specification Repair for Liveness-Preserving Shielding in Reinforcement Learning
Summary
This paper presents an adaptive shielding framework for reinforcement learning that uses GR(1) specifications to enforce both safety and liveness, repairing the specification at runtime when environment assumptions are violated.
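For readers unfamiliar with the fragment, a GR(1) specification is conventionally written as an implication from environment assumptions to system guarantees. The symbols below follow the standard presentation in the literature and are an assumption here; they are not taken from this paper's own notation.

```latex
\[
\Bigl(\theta^{e} \wedge \mathbf{G}\,\rho^{e} \wedge \bigwedge_{i=1}^{m} \mathbf{G}\mathbf{F}\,J^{e}_{i}\Bigr)
\;\rightarrow\;
\Bigl(\theta^{s} \wedge \mathbf{G}\,\rho^{s} \wedge \bigwedge_{j=1}^{n} \mathbf{G}\mathbf{F}\,J^{s}_{j}\Bigr)
\]
```

Here the superscripts e and s mark the environment and the system, θ is an initial condition, ρ is a safety (transition) constraint that must always hold, and each J is a goal that must hold infinitely often. The G ρ conjuncts carry the safety part and the GF J conjuncts carry the liveness part.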
Why It Matters
The research addresses a limitation of static shielding methods in reinforcement learning: they assume a fixed specification and fail to adapt when environment assumptions are violated. An adaptive shield lets the RL agent stay compliant with its specification under changed conditions while still optimizing reward.
Key Takeaways
- Adaptive shielding improves safety in reinforcement learning by dynamically adjusting to environmental changes.
- The framework employs GR(1) specifications to balance safety and liveness properties effectively.
- Inductive Logic Programming (ILP) is used for online specification repair, enhancing interpretability.
- Case studies demonstrate the superiority of adaptive shields over static controllers in maintaining performance.
- The approach minimizes goal weakening, ensuring that agents remain compliant while optimizing rewards.
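The takeaways above can be illustrated with a minimal sketch of the monitor, repair, and filter loop. Everything here is hypothetical: `ToySpec`, `toy_repair`, `AdaptiveShield`, and the corridor state fields `gap` and `obstacle_speed` are invented for illustration, and the hand-written `toy_repair` only stands in for the ILP-based repair the paper describes. The sketch covers safety filtering and the repair trigger only; it does not model liveness goals or GR(1) synthesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass(frozen=True)
class ToySpec:
    """Toy stand-in for a specification (hypothetical, not a real GR(1) spec)."""
    max_obstacle_speed: int  # assumed bound on how fast the obstacle moves

    def assumption_holds(self, state: State) -> bool:
        # Environment assumption: the obstacle never exceeds the assumed speed.
        return state["obstacle_speed"] <= self.max_obstacle_speed

    def is_safe(self, state: State, action: str) -> bool:
        # "forward" is allowed only if the gap exceeds the agent's one-cell
        # move plus the farthest the obstacle is assumed to travel in a step.
        return action != "forward" or state["gap"] > self.max_obstacle_speed + 1


def toy_repair(spec: ToySpec, state: State) -> ToySpec:
    # Placeholder for specification repair: weaken the violated assumption
    # to match the observed behaviour, which tightens the forward guard.
    return ToySpec(max_obstacle_speed=state["obstacle_speed"])


class AdaptiveShield:
    def __init__(self, spec: ToySpec, repair: Callable[[ToySpec, State], ToySpec]):
        self.spec = spec
        self.repair = repair
        self.repairs = 0

    def filter(self, state: State, proposed: str, actions: List[str]) -> str:
        # 1. Monitor: repair the spec if the environment assumption is violated.
        if not self.spec.assumption_holds(state):
            self.spec = self.repair(self.spec, state)
            self.repairs += 1
        # 2. Shield: pass the agent's action through if it is safe.
        if self.spec.is_safe(state, proposed):
            return proposed
        # 3. Otherwise substitute the first safe alternative.
        for action in actions:
            if self.spec.is_safe(state, action):
                return action
        raise RuntimeError("no safe action available")


shield = AdaptiveShield(ToySpec(max_obstacle_speed=1), toy_repair)
actions = ["forward", "wait"]
first = shield.filter({"gap": 3, "obstacle_speed": 1}, "forward", actions)
second = shield.filter({"gap": 3, "obstacle_speed": 2}, "forward", actions)
third = shield.filter({"gap": 4, "obstacle_speed": 2}, "forward", actions)
```

In this run the first call passes `forward` through, the second observes a faster obstacle, repairs the spec once, and overrides the action with `wait`, and the third allows `forward` again under the repaired spec. A static shield would correspond to skipping step 1 and keeping the original speed bound.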
arXiv:2511.02605 (cs), Computer Science > Artificial Intelligence
Submitted on 4 Nov 2025 (v1), last revised 20 Feb 2026 (this version, v2)
Authors: Tiberiu-Andrei Georgescu, Alexander W. Goodall, Dalal Alrajeh, Francesco Belardinelli, Sebastian Uchitel
Abstract
Shielding is widely used to enforce safety in reinforcement learning (RL), ensuring that an agent's actions remain compliant with formal specifications. Classical shielding approaches, however, are often static, in the sense that they assume fixed logical specifications and hand-crafted abstractions. While these static shields provide safety under nominal assumptions, they fail to adapt when environment assumptions are violated. In this paper, we develop an adaptive shielding framework based on Generalized Reactivity of rank 1 (GR(1)) specifications, a tractable and expressive fragment of Linear Temporal Logic (LTL) that captures both safety and liveness properties. Our method detects environment assumption violations at runtime and employs Inductive Logic Programming (ILP) to automatically repair GR(1) specifications online, in a systematic and interpretable way. This ensures that the shield evolves grac...