[2503.07885] Safety Guardrails for LLM-Enabled Robots
Computer Science > Robotics
arXiv:2503.07885 (cs)
[Submitted on 10 Mar 2025 (v1), last revised 3 Mar 2026 (this version, v2)]

Title: Safety Guardrails for LLM-Enabled Robots
Authors: Zachary Ravichandran, Alexander Robey, Vijay Kumar, George J. Pappas, Hamed Hassani

Abstract: Although the integration of large language models (LLMs) into robotics has unlocked transformative capabilities, it has also introduced significant safety concerns, ranging from average-case LLM errors (e.g., hallucinations) to adversarial jailbreaking attacks, which can produce harmful robot behavior in real-world settings. Traditional robot safety approaches do not address the contextual vulnerabilities of LLMs, and current LLM safety approaches overlook the physical risks posed by robots operating in real-world environments. To ensure the safety of LLM-enabled robots, we propose RoboGuard, a two-stage guardrail architecture. RoboGuard first contextualizes pre-defined safety rules by grounding them in the robot's environment using a root-of-trust LLM. This LLM is shielded from malicious prompts and employs chain-of-thought (CoT) reasoning to generate context-dependent safety specifications, such as temporal logic constraints. RoboGuard then resolves conflicts between these contextual safety specifications and potentially unsafe plans using temporal logic control synthesis.
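To make the two-stage architecture concrete, here is a minimal Python sketch of the flow the abstract describes, under heavy simplification: the root-of-trust LLM is mocked by a hard-coded grounding step, temporal logic specifications are reduced to plan-level "never" predicates, and the control-synthesis stage is replaced by minimal plan filtering. All names (root_of_trust_llm, Spec, resolve, and so on) are hypothetical illustrations, not the paper's API.

```python
# Illustrative sketch, not the paper's implementation, of a RoboGuard-style
# two-stage guardrail. Stage 1 grounds generic safety rules in the robot's
# environment; stage 2 reconciles a proposed plan with the resulting specs.

from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """A context-grounded safety specification, e.g. 'never enter(stairwell)'."""
    kind: str       # "never" (the only kind this sketch handles)
    predicate: str  # action the spec constrains


def root_of_trust_llm(rules: list[str], environment: dict) -> list[Spec]:
    """Stage 1 stand-in: ground pre-defined safety rules in the environment.

    In RoboGuard this is an isolated LLM, shielded from user prompts, that
    uses chain-of-thought reasoning to emit temporal logic constraints.
    Here one grounding step is hard-coded as a mock.
    """
    specs: list[Spec] = []
    for rule in rules:
        if rule == "do not enter hazardous areas":
            for region in environment.get("hazardous_regions", []):
                specs.append(Spec("never", f"enter({region})"))
    return specs


def violates(plan: list[str], spec: Spec) -> bool:
    """Check a finite plan (a sequence of actions) against one spec."""
    return spec.kind == "never" and spec.predicate in plan


def resolve(plan: list[str], specs: list[Spec]) -> list[str]:
    """Stage 2 stand-in: instead of full temporal logic control synthesis,
    drop offending actions so the remaining plan satisfies every spec
    while deviating minimally from the planner's proposal."""
    banned = {s.predicate for s in specs if s.kind == "never"}
    return [action for action in plan if action not in banned]


if __name__ == "__main__":
    env = {"hazardous_regions": ["stairwell"]}
    rules = ["do not enter hazardous areas"]
    specs = root_of_trust_llm(rules, env)           # stage 1: contextual specs
    proposed = ["goto(kitchen)", "enter(stairwell)", "goto(lab)"]
    assert any(violates(proposed, s) for s in specs)
    print(resolve(proposed, specs))                 # ['goto(kitchen)', 'goto(lab)']
```

In the actual system the second stage is a synthesis problem over temporal logic formulas rather than action filtering; the sketch only conveys the division of labor between the shielded rule-grounding LLM and the downstream plan-repair step.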