[2602.17312] LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy

arXiv - Machine Learning

Summary

The paper presents LexiSafe, a novel offline safe reinforcement learning framework that employs a lexicographic safety-reward hierarchy to enhance safety in cyber-physical systems.

Why It Matters

As cyber-physical systems become more prevalent, ensuring safety during reinforcement learning is critical: in offline settings, only pre-collected data are available and safety violations during training are unacceptable. LexiSafe addresses the limitations of existing methods by imposing a structured hierarchy that prioritizes safety over reward, making it relevant for researchers and practitioners in AI safety and machine learning.

Key Takeaways

  • LexiSafe introduces a lexicographic framework for offline safe reinforcement learning.
  • The framework includes both single-cost and multi-cost formulations to handle varying safety requirements.
  • Empirical results show LexiSafe reduces safety violations while improving task performance compared to existing methods.

Computer Science > Machine Learning
arXiv:2602.17312 (cs) [Submitted on 19 Feb 2026]

Title: LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy

Authors: Hsin-Jung Yang, Zhanhong Jiang, Prajwal Koirala, Qisai Liu, Cody Fleming, Soumik Sarkar

Abstract: Offline safe reinforcement learning (RL) is increasingly important for cyber-physical systems (CPS), where safety violations during training are unacceptable and only pre-collected data are available. Existing offline safe RL methods typically balance reward-safety tradeoffs through constraint relaxation or joint optimization, but they often lack structural mechanisms to prevent safety drift. We propose LexiSafe, a lexicographic offline RL framework designed to preserve safety-aligned behavior. We first develop LexiSafe-SC, a single-cost formulation for standard offline safe RL, and derive safety-violation and performance-suboptimality bounds that together yield sample-complexity guarantees. We then extend the framework to hierarchical safety requirements with LexiSafe-MC, which supports multiple safety costs and admits its own sample-complexity analysis. Empirically, LexiSafe demonstrates reduced safety violations and improved task performance compared to constrained offline baselines. By unifying lexicograp...
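To make the lexicographic idea concrete, here is a minimal Python sketch of how candidate policies could be ranked under a safety-first hierarchy: safety costs are compared first (each clipped at a tolerance threshold, so sufficiently safe policies tie on that dimension), and reward only breaks ties among equally safe candidates. This is an illustrative assumption about what "lexicographic safety-reward hierarchy" means in general, not the paper's actual algorithm; the function names, tolerance scheme, and numbers are hypothetical.

```python
def lexicographic_key(costs, reward, tolerances):
    """Build a sort key: safety costs dominate, reward breaks ties.

    Each cost is clipped at its tolerance, so any policy within
    tolerance on a cost dimension counts as equally safe there.
    Lower key is better (costs ascending, then reward descending).
    """
    clipped = tuple(max(c - t, 0.0) for c, t in zip(costs, tolerances))
    return clipped + (-reward,)

# Hypothetical candidate policies with (safety cost, reward) estimates.
policies = {
    "pi_a": {"costs": [0.02], "reward": 0.90},
    "pi_b": {"costs": [0.00], "reward": 0.80},
    "pi_c": {"costs": [0.30], "reward": 0.99},  # unsafe but high reward
}
tolerances = [0.05]  # single-cost setting; a multi-cost setting would list several

best = min(
    policies,
    key=lambda name: lexicographic_key(
        policies[name]["costs"], policies[name]["reward"], tolerances
    ),
)
# pi_a and pi_b are both within the safety tolerance, so reward decides
# between them; pi_c's high reward cannot outrank its safety violation.
print(best)  # -> pi_a
```

In the multi-cost setting (as in LexiSafe-MC), `costs` and `tolerances` would simply carry one entry per safety requirement, with earlier entries dominating later ones in the tuple comparison.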

