[2602.13516] SPILLage: Agentic Oversharing on the Web
Summary
The paper introduces SPILLage, a framework for characterizing unintentional oversharing by LLM-powered web agents, and highlights behavioral oversharing as a significant privacy concern.
Why It Matters
As LLM-powered agents increasingly automate tasks online, understanding how they manage user data is crucial. This research reveals that behavioral oversharing is prevalent, which poses privacy risks and challenges the effectiveness of existing mitigation strategies. The findings suggest that improving privacy measures can enhance task success rates, making this study relevant for developers and researchers in AI safety and user privacy.
Key Takeaways
- The SPILLage framework characterizes oversharing in web agents along two dimensions: channel (content vs. behavior) and directness (explicit vs. implicit).
- Behavioral oversharing is five times more prevalent than content oversharing.
- Limiting task-irrelevant information can improve task success by up to 17.9%.
- Current privacy measures may not adequately address the complexities of agent behavior.
- The study emphasizes the need for a broader understanding of privacy in AI applications.
Computer Science > Artificial Intelligence
arXiv:2602.13516 (cs)
Submitted on 13 Feb 2026
Title: SPILLage: Agentic Oversharing on the Web
Authors: Jaechul Roh, Eugene Bagdasarian, Hamed Haddadi, Ali Shahin Shamsabadi
Abstract: LLM-powered agents are beginning to automate user's tasks across the open web, often with access to user resources such as emails and calendars. Unlike standard LLMs answering questions in a controlled ChatBot setting, web agents act "in the wild", interacting with third parties and leaving behind an action trace. Therefore, we ask the question: how do web agents handle user resources when accomplishing tasks on their behalf across live websites? In this paper, we formalize Natural Agentic Oversharing -- the unintentional disclosure of task-irrelevant user information through an agent trace of actions on the web. We introduce SPILLage, a framework that characterizes oversharing along two dimensions: channel (content vs. behavior) and directness (explicit vs. implicit). This taxonomy reveals a critical blind spot: while prior work focuses on text leakage, web agents also overshare behaviorally through clicks, scrolls, and navigation patterns that can be monitored. We benchmark 180 tasks on live e-commerce sites with ground-truth annotations separating task-relevant from task-irrelevant attributes. Across 1,080 runs sp...
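The two-dimensional taxonomy in the abstract (channel and directness) can be pictured as a small data structure. The following is a minimal sketch, not the authors' implementation: the names `Channel`, `Directness`, `TraceStep`, and `count_oversharing` are hypothetical, the example trace and attribute labels are made up for illustration, and reading "explicit" as directly stated and "implicit" as inferable is an assumption. It tallies, per taxonomy cell, the steps that reveal at least one attribute outside the task-relevant set.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    CONTENT = "content"    # text the agent types or submits
    BEHAVIOR = "behavior"  # clicks, scrolls, navigation


class Directness(Enum):
    EXPLICIT = "explicit"  # assumed: attribute is stated directly
    IMPLICIT = "implicit"  # assumed: attribute can be inferred


@dataclass(frozen=True)
class TraceStep:
    """One step of an agent's action trace (hypothetical schema)."""
    action: str
    channel: Channel
    directness: Directness
    revealed: frozenset  # user attributes this step exposes


def count_oversharing(trace, task_relevant):
    """Count steps per (channel, directness) cell that expose
    at least one attribute not needed for the task."""
    counts = Counter()
    for step in trace:
        if step.revealed - task_relevant:
            counts[(step.channel, step.directness)] += 1
    return counts


# Illustrative data only: a shoe-shopping task where just the size matters.
task_relevant = {"shoe_size"}
trace = [
    TraceStep("type 'running shoes size 10'", Channel.CONTENT,
              Directness.EXPLICIT, frozenset({"shoe_size"})),
    TraceStep("type 'shoes for knee injury'", Channel.CONTENT,
              Directness.EXPLICIT, frozenset({"health_condition"})),
    TraceStep("click 'Orthopedic' filter", Channel.BEHAVIOR,
              Directness.IMPLICIT, frozenset({"health_condition"})),
    TraceStep("open 'Plus-size activewear' category", Channel.BEHAVIOR,
              Directness.IMPLICIT, frozenset({"body_type"})),
]

counts = count_oversharing(trace, task_relevant)
for (channel, directness), n in sorted(counts.items(),
                                       key=lambda kv: kv[0][0].value):
    print(f"{channel.value}/{directness.value}: {n}")
```

Under these assumptions, the first step is not counted because it reveals only a task-relevant attribute, while the two behavioral steps are counted even though no text is typed, which is the blind spot the abstract describes. The counts here say nothing about the paper's measured rates.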