[2602.20064] The LLMbda Calculus: AI Agents, Conversations, and Information Flow
Summary
The LLMbda Calculus introduces a formal framework for reasoning about the conversations AI agents construct, modeling vulnerabilities such as prompt injection and establishing safety guarantees.
Why It Matters
As AI agents increasingly engage in complex interactions, understanding their operational semantics is crucial for ensuring safety and integrity. This paper lays the groundwork for developing secure AI systems by formalizing the conversation process and identifying potential attack vectors.
Key Takeaways
- Introduces a lambda calculus for modeling AI conversations.
- Highlights vulnerabilities related to prompt injection in AI systems.
- Establishes safety guarantees through formal semantics and dynamic information-flow control.
- Proposes defenses like quarantined sub-conversations.
- Addresses the need for a principled approach to AI agent programming.
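The information-flow and quarantine ideas above can be illustrated with a toy sketch. This is our own hedged illustration, not the paper's calculus: `Labeled`, `combine`, `run_tool`, and `quarantined` are hypothetical names, and the taint discipline (any value influenced by untrusted model output stays tainted and is blocked at tool boundaries) is a standard dynamic information-flow idea assumed here for illustration.

```python
from dataclasses import dataclass

# Hypothetical sketch, not the paper's formal syntax: values carry a
# taint label recording whether they derive from untrusted model output.

@dataclass(frozen=True)
class Labeled:
    value: str
    tainted: bool  # True if influenced by untrusted text

def combine(a: Labeled, b: Labeled) -> Labeled:
    """Taint propagates: any result touching a tainted input is tainted."""
    return Labeled(a.value + b.value, a.tainted or b.tainted)

def run_tool(cmd: Labeled) -> str:
    """A guarded tool call: refuse to act on tainted arguments."""
    if cmd.tainted:
        raise PermissionError("tainted value reached a tool boundary")
    return f"executed: {cmd.value}"

def quarantined(untrusted_text: str) -> Labeled:
    """Model a quarantined sub-conversation: untrusted text can be
    inspected or summarized, but its result remains tainted."""
    return Labeled(untrusted_text, tainted=True)
```

Under this sketch, `run_tool(Labeled("ls", tainted=False))` succeeds, while `run_tool(quarantined("rm -rf /"))` raises `PermissionError`, capturing how a quarantine keeps injected text from driving dangerous tool calls.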
Computer Science > Programming Languages
arXiv:2602.20064 (cs)
[Submitted on 23 Feb 2026]
Title: The LLMbda Calculus: AI Agents, Conversations, and Information Flow
Authors: Zac Garby, Andrew D. Gordon, David Sands
Abstract: A conversation with a large language model (LLM) is a sequence of prompts and responses, with each response generated from the preceding conversation. AI agents build such conversations automatically: given an initial human prompt, a planner loop interleaves LLM calls with tool invocations and code execution. This tight coupling creates a new and poorly understood attack surface. A malicious prompt injected into a conversation can compromise later reasoning, trigger dangerous tool calls, or distort final outputs. Despite the centrality of such systems, we currently lack a principled semantic foundation for reasoning about their behaviour and safety. We address this gap by introducing an untyped call-by-value lambda calculus enriched with dynamic information-flow control and a small number of primitives for constructing prompt-response conversations. Our language includes a primitive that invokes an LLM: it serializes a value, sends it to the model as a prompt, and parses the response as a new term. This calculus faithfully represents planner loops and their vulnerabilities, including the mechanisms ...
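The abstract's LLM-invoking primitive (serialize a value, send it as a prompt, parse the response as a new term) can be sketched as follows. All names here (`ask`, `fake_llm`) are our own assumptions, and JSON stands in for whatever serialization and parsing the paper actually defines; the model call is a stub.

```python
import json

# Illustrative stub, not the paper's primitive: `ask` serializes a value,
# sends it to a model as a prompt, and parses the response as a new term,
# recording the turn in the conversation (a sequence of prompt/response pairs).

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns a structured reply.
    return json.dumps({"answer": f"reply to: {prompt}"})

def ask(value, conversation, model=fake_llm):
    prompt = json.dumps(value)               # serialize the value
    response = model(prompt)                 # one prompt/response turn
    term = json.loads(response)              # parse the response as a term
    conversation.append((prompt, response))  # extend the conversation
    return term, conversation
```

A planner loop in this style would call `ask` repeatedly, feeding parsed terms into tool invocations, which is exactly where the injection risks described above arise.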