[2510.15994] MCP Security Bench (MSB): Benchmarking Attacks Against

[2510.15994] MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

arXiv - AI March 25, 2026 4 min read

About this article

Abstract page for arXiv paper 2510.15994: MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

Computer Science > Cryptography and Security arXiv:2510.15994 (cs) [Submitted on 14 Oct 2025 (v1), last revised 24 Mar 2026 (this version, v2)] Title:MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents Authors:Dongsen Zhang, Zekun Li, Xu Luo, Xuannan Liu, Peipei Li, Wenjun Xu View a PDF of the paper titled MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents, by Dongsen Zhang and 5 other authors View PDF HTML (experimental) Abstract:The Model Context Protocol (MCP) standardizes how large language model (LLM) agents discover, describe, and call external tools. While MCP unlocks broad interoperability, it also enlarges the attack surface by making tools first-class, composable objects with natural-language metadata, and standardized I/O. We present MSB (MCP Security Benchmark), the first end-to-end evaluation suite that systematically measures how well LLM agents resist MCP-specific attacks throughout the full tool-use pipeline: task planning, tool invocation, and response handling. MSB contributes: (1) a taxonomy of 12 attacks including name-collision, preference manipulation, prompt injections embedded in tool descriptions, out-of-scope parameter requests, user-impersonating responses, false-error escalation, tool-transfer, retrieval injection, and mixed attacks; (2) an evaluation harness that executes attacks by running real tools (both benign and malicious) via MCP rather than simulation; and ...

Originally published on March 25, 2026. Curated by AI News.

Llms

🤖 AI News Digest - March 27, 2026

Today's AI news: 1. My minute-by-minute response to the LiteLLM malware attack The article describes a detailed, minute-by-minute respons...

Reddit - Artificial Intelligence · 1 min · 44 minutes ago

Llms

[D] Real-time Student Attention Detection: ResNet vs Facial Landmarks - Which approach for resource-constrained deployment?

I have a problem statement where we are supposed to detect the attention level of student in a classroom, basically output whether he is ...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

[D] We audited LoCoMo: 6.4% of the answer key is wrong and the judge accepts up to 63% of intentionally wrong answers

Projects are still submitting new scores on LoCoMo as of March 2026. We audited it and found 6.4% of the answer key is wrong, and the LLM...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request

I'm looking to work with people interested in math, machine learning, or agentic coding, on creating a multi-agent framework to do fronti...

Reddit - Machine Learning · 1 min · about 3 hours ago

[2510.15994] MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

About this article

Related Articles

🤖 AI News Digest - March 27, 2026

[D] Real-time Student Attention Detection: ResNet vs Facial Landmarks - Which approach for resource-constrained deployment?

[D] We audited LoCoMo: 6.4% of the answer key is wrong and the judge accepts up to 63% of intentionally wrong answers

[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request

No comments

Stay updated with AI News