Llms Machine Learning Nlp Ai Safety

[2602.15344] ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models

arXiv - Machine Learning February 18, 2026 3 min read Article

Summary

The paper presents ER-MIA, a framework for black-box adversarial memory injection attacks on long-term memory-augmented large language models, highlighting significant vulnerabilities in their retrieval mechanisms.

Why It Matters

As large language models increasingly integrate long-term memory systems, understanding their vulnerabilities is crucial for enhancing AI safety. This research exposes critical security risks that could be exploited, informing developers and researchers about potential attack vectors and the need for improved defenses.

Key Takeaways

ER-MIA framework identifies vulnerabilities in memory-augmented LLMs.
Two attack settings are formalized: content-based and question-targeted.
Similarity-based retrieval mechanisms pose significant security risks.
Extensive experiments reveal persistent vulnerabilities across various LLMs.
The findings highlight the need for enhanced security measures in AI systems.

Computer Science > Machine Learning arXiv:2602.15344 (cs) [Submitted on 17 Feb 2026] Title:ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models Authors:Mitchell Piehl, Zhaohan Xi, Zuobin Xiong, Pan He, Muchao Ye View a PDF of the paper titled ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models, by Mitchell Piehl and 4 other authors View PDF HTML (experimental) Abstract:Large language models (LLMs) are increasingly augmented with long-term memory systems to overcome finite context windows and enable persistent reasoning across interactions. However, recent research finds that LLMs become more vulnerable because memory provides extra attack surfaces. In this paper, we present the first systematic study of black-box adversarial memory injection attacks that target the similarity-based retrieval mechanism in long-term memory-augmented LLMs. We introduce ER-MIA, a unified framework that exposes this vulnerability and formalizes two realistic attack settings: content-based attacks and question-targeted attacks. In these settings, ER-MIA includes an arsenal of composable attack primitives and ensemble attacks that achieve high success rates under minimal attacker assumptions. Extensive experiments across multiple LLMs and long-term memory systems demonstrate that similarity-based retrieval constitutes a fundamental and system-level vulnerability, revealing security risks t...

Read Original Article

Llms

[R] Reference model free behavioral discovery of AudiBench model organisms via Probe-Mediated Adaptive Auditing

Anthropic's AuditBench - 56 Llama 3.3 70B models with planted hidden behaviors - their best agent detects the behaviros 10-13% of the tim...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

[P] Dante-2B: I'm training a 2.1B bilingual fully open Italian/English LLM from scratch on 2×H200. Phase 1 done — here's what I've built.

The problem If you work with Italian text and local models, you know the pain. Every open-source LLM out there treats Italian as an after...

Reddit - Machine Learning · 1 min · about 3 hours ago

Llms

I have been coding for 11 years and I caught myself completely unable to debug a problem without AI assistance last month. That scared me more than anything I have seen in this industry.

I want to be honest about something that happened to me because I think it is more common than people admit. Last month I hit a bug in a ...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

Llms

OpenClaw security checklist: practical safeguards for AI agents

Here is one of the better quality guides on the ensuring safety when deploying OpenClaw: https://chatgptguide.ai/openclaw-security-checkl...

Reddit - Artificial Intelligence · 1 min · about 10 hours ago

[2602.15344] ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models

Summary

Why It Matters

Key Takeaways

Related Articles

[R] Reference model free behavioral discovery of AudiBench model organisms via Probe-Mediated Adaptive Auditing

[P] Dante-2B: I'm training a 2.1B bilingual fully open Italian/English LLM from scratch on 2×H200. Phase 1 done — here's what I've built.

I have been coding for 11 years and I caught myself completely unable to debug a problem without AI assistance last month. That scared me more than anything I have seen in this industry.

OpenClaw security checklist: practical safeguards for AI agents

No comments

Stay updated with AI News