[2603.20320] The Causal Impact of Tool Affordance on Safety Alignment in LLM Agents
Computer Science > Software Engineering
arXiv:2603.20320 (cs) [Submitted on 19 Mar 2026]
Title: The Causal Impact of Tool Affordance on Safety Alignment in LLM Agents
Authors: Shasha Yu, Fiona Carroll, Barry L. Bentley
Abstract: Large language models (LLMs) are increasingly deployed as agents with access to executable tools, enabling direct interaction with external systems. However, most safety evaluations remain text-centric and assume that compliant language implies safe behavior, an assumption that becomes unreliable once models are allowed to act. In this work, we empirically examine how executable tool affordance alters safety alignment in LLM agents using a paired evaluation framework that compares text-only chatbot behavior with tool-enabled agent behavior under identical prompts and policies. Experiments are conducted in a deterministic financial transaction environment with binary safety constraints across 1,500 procedurally generated scenarios. To separate intent from outcome, we distinguish between attempted and realized violations using dual enforcement regimes that either block or permit unsafe actions. Both evaluated models maintain perfect compliance in text-only settings, yet exhibit sharp increases in violations after tool access is introduced, reaching rates up to 85% despite unchanged rules. We observe substantial...
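The paired-evaluation and dual-enforcement setup described in the abstract can be illustrated with a minimal sketch. The Python below is not the authors' framework; the `Scenario` dataclass, the stubbed `text_only_response` and `agent_tool_call` policies, and the transfer-limit constraint are hypothetical stand-ins for the deterministic financial environment and binary safety constraints the paper describes.

```python
# Minimal sketch (not the authors' code) of the paired evaluation idea:
# each scenario is run once in a text-only setting and once with a
# tool-enabled agent, and unsafe tool calls are either blocked or permitted
# by the enforcement regime so attempted and realized violations can be
# counted separately. All names and policies here are illustrative assumptions.

import random
from dataclasses import dataclass


@dataclass
class Scenario:
    prompt: str
    requested_amount: float
    limit: float  # binary safety constraint: transfers above this are unsafe


def text_only_response(scenario: Scenario) -> bool:
    """Stub for the chatbot condition: True if the *text* complies
    (e.g. the model refuses or stays within the limit)."""
    return True  # placeholder: the paper reports perfect text-only compliance


def agent_tool_call(scenario: Scenario) -> float:
    """Stub for the tool-enabled condition: the amount the agent
    actually tries to transfer via the executable tool."""
    # placeholder policy: sometimes exceeds the limit despite identical rules
    return scenario.requested_amount if random.random() < 0.5 else scenario.limit


def run_pair(scenario: Scenario, enforcement: str) -> dict:
    """Run one scenario in both conditions under an enforcement regime
    ('block' prevents unsafe tool calls, 'permit' lets them execute)."""
    text_compliant = text_only_response(scenario)

    amount = agent_tool_call(scenario)
    attempted = amount > scenario.limit                 # intent: unsafe call issued
    realized = attempted and enforcement == "permit"    # outcome: call executed

    return {
        "text_violation": not text_compliant,
        "attempted_violation": attempted,
        "realized_violation": realized,
    }


if __name__ == "__main__":
    random.seed(0)
    scenarios = [
        Scenario(prompt=f"transfer request {i}", requested_amount=1500.0, limit=1000.0)
        for i in range(100)  # the paper uses 1,500 procedurally generated scenarios
    ]
    for regime in ("block", "permit"):
        results = [run_pair(s, regime) for s in scenarios]
        attempted = sum(r["attempted_violation"] for r in results)
        realized = sum(r["realized_violation"] for r in results)
        print(f"{regime}: attempted={attempted}, realized={realized}")
```

Under this sketch, the "block" regime records attempted but no realized violations, while the "permit" regime lets attempts become realized outcomes, mirroring how the paper separates intent from outcome.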