[2411.02317] Defining and Evaluating Physical Safety for Large Language Models

arXiv - AI · 4 min read

Summary

This paper explores the physical safety of Large Language Models (LLMs) in controlling robotic systems, identifying risks and proposing a benchmark for evaluation.

Why It Matters

As LLMs are increasingly integrated into robotics, understanding their safety implications is crucial. This research addresses a significant gap in evaluating the risks associated with LLMs, particularly in real-world applications where physical harm could occur. The findings can inform safer designs and regulatory frameworks for AI technologies.

Key Takeaways

  • LLMs can pose physical safety risks when controlling drones.
  • Four categories of threats are identified: human-targeted, object-targeted, infrastructure attacks, and regulatory violations.
  • There is a trade-off between utility and safety in LLM performance.
  • Advanced techniques like In-Context Learning improve safety but are not foolproof.
  • Larger models tend to demonstrate better safety capabilities.
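The four-category threat taxonomy above can be made concrete with a small illustrative sketch. This is not the paper's benchmark code; the keyword lists and function names are hypothetical placeholders showing how a command screen might map drone instructions to the four categories and decide when to refuse:

```python
# Illustrative sketch only (not the paper's actual benchmark): a minimal
# rule-based screen that sorts drone commands into the paper's four threat
# categories before any code generation is attempted. The keyword lists
# below are hypothetical examples, not the paper's definitions.

THREAT_KEYWORDS = {
    "human-targeted": ["crowd", "person", "people"],
    "object-targeted": ["vehicle", "window", "collide"],
    "infrastructure": ["power line", "antenna", "airport tower"],
    "regulatory": ["no-fly zone", "above 400 feet", "restricted airspace"],
}

def screen_command(command: str) -> list[str]:
    """Return the threat categories the command appears to match (may be empty)."""
    text = command.lower()
    return [cat for cat, words in THREAT_KEYWORDS.items()
            if any(w in text for w in words)]

def should_refuse(command: str) -> bool:
    """Refuse whenever at least one threat category is triggered."""
    return bool(screen_command(command))

print(should_refuse("Fly over the crowd at low altitude"))   # True
print(should_refuse("Survey the empty field at 50 meters"))  # False
```

A keyword filter like this illustrates why the paper's finding about unintentional attacks matters: a command can be dangerous without containing any obvious trigger words, which is exactly the case where even In-Context Learning and Chain-of-Thought prompting struggled.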

Computer Science > Machine Learning

arXiv:2411.02317 (cs) [Submitted on 4 Nov 2024 (v1), last revised 19 Feb 2026 (this version, v2)]

Title: Defining and Evaluating Physical Safety for Large Language Models
Authors: Yung-Chen Tang, Pin-Yu Chen, Tsung-Yi Ho

Abstract: Large Language Models (LLMs) are increasingly used to control robotic systems such as drones, but their risks of causing physical threats and harm in real-world applications remain unexplored. Our study addresses the critical gap in evaluating LLM physical safety by developing a comprehensive benchmark for drone control. We classify the physical safety risks of drones into four categories: (1) human-targeted threats, (2) object-targeted threats, (3) infrastructure attacks, and (4) regulatory violations. Our evaluation of mainstream LLMs reveals an undesirable trade-off between utility and safety, with models that excel in code generation often performing poorly in crucial safety aspects. Furthermore, while incorporating advanced prompt engineering techniques such as In-Context Learning and Chain-of-Thought can improve safety, these methods still struggle to identify unintentional attacks. In addition, larger models demonstrate better safety capabilities, particularly in refusing dangerous commands. Our findings and benchmark can facilitate the design and evaluation of phy...

Related Articles

[2603.18532] Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds
arXiv - Machine Learning · 4 min

[2603.12702] FGTR: Fine-Grained Multi-Table Retrieval via Hierarchical LLM Reasoning
arXiv - Machine Learning · 4 min

[2603.12681] Colluding LoRA: A Compositional Vulnerability in LLM Safety Alignment
arXiv - Machine Learning · 3 min

[2602.06098] A Theoretical Analysis of Test-Driven LLM Code Generation
arXiv - Machine Learning · 3 min
