[2602.02007] Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

arXiv - AI · 4 min read · Article

Summary

The paper introduces xMemory, a novel approach to agent memory systems that enhances retrieval by decoupling and aggregating semantic components, improving answer quality and token efficiency in dialogue contexts.

Why It Matters

This research addresses limitations in traditional Retrieval-Augmented Generation (RAG) methods, which often produce redundant information in agent memory systems. By proposing a hierarchical structure for memory retrieval, it aims to enhance the efficiency and relevance of responses in AI-driven dialogues, which is crucial for advancing conversational AI technologies.

Key Takeaways

  • Traditional RAG methods are inadequate for coherent dialogue streams due to redundancy.
  • xMemory utilizes a hierarchical structure to improve retrieval efficiency.
  • The proposed method enhances answer quality and reduces token usage in LLMs.
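The redundancy issue in the first takeaway can be seen in a toy sketch. This is not code from the paper: the memory texts and 2-D embeddings are invented for illustration, and `top_k` stands in for any fixed top-$k$ similarity retriever. Because a coherent dialogue stream restates the same facts, the top-$k$ results are near-duplicates.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy memory stream: a coherent dialogue produces highly correlated spans,
# so several entries paraphrase the same fact (embeddings are illustrative).
memories = [
    ("I moved to Berlin last month.",      [0.99, 0.10]),
    ("So yeah, I'm living in Berlin now.", [0.98, 0.12]),
    ("I relocated to Berlin recently.",    [0.97, 0.11]),
    ("My sister still lives in Madrid.",   [0.30, 0.95]),
]

def top_k(query, k):
    """Standard fixed top-k similarity retrieval over the memory list."""
    ranked = sorted(memories, key=lambda m: cosine(query, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A "where does the user live?" query fills all k slots with paraphrases
# of one fact -- redundant context that spends tokens without adding info.
print(top_k([1.0, 0.1], 3))
```

All three returned spans restate the Berlin fact, which is exactly the redundancy that motivates retrieving over decoupled components instead of raw spans.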

Computer Science > Computation and Language
arXiv:2602.02007 (cs)
[Submitted on 2 Feb 2026 (v1), last revised 25 Feb 2026 (this version, v2)]

Title: Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation
Authors: Zhanghao Hu, Qinglin Zhu, Hanqi Yan, Yulan He, Lin Gui

Abstract: Agent memory systems often adopt the standard Retrieval-Augmented Generation (RAG) pipeline, yet its underlying assumptions differ in this setting. RAG targets large, heterogeneous corpora where retrieved passages are diverse, whereas agent memory is a bounded, coherent dialogue stream with highly correlated spans that are often duplicates. Under this shift, fixed top-$k$ similarity retrieval tends to return redundant context, and post-hoc pruning can delete temporally linked prerequisites needed for correct reasoning. We argue retrieval should move beyond similarity matching and instead operate over latent components, following decoupling to aggregation: disentangle memories into semantic components, organise them into a hierarchy, and use this structure to drive retrieval. We propose xMemory, which builds a hierarchy of intact units and maintains a searchable yet faithful high-level node organisation via a sparsity--semantics objective that guides memory split and merge. At inference, xMemory retrieves top-down, selecting a c...
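The paper's code is not reproduced here, so the following is only a minimal sketch of the top-down hierarchical retrieval the abstract describes. `Node`, `retrieve_top_down`, the `beam` parameter, and the toy embeddings are all assumptions invented for illustration; the sparsity--semantics objective that drives xMemory's split/merge is not modeled.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class Node:
    """One node in the memory hierarchy: leaves hold intact memory spans,
    internal nodes summarise their children via an embedding."""
    def __init__(self, embedding, text=None, children=None):
        self.embedding = embedding
        self.text = text            # leaf span; None for internal nodes
        self.children = children or []

def retrieve_top_down(root, query, beam=2):
    """Descend the hierarchy level by level, keeping only the `beam` most
    query-similar children of each expanded node; return reached leaves."""
    frontier, leaves = [root], []
    while frontier:
        next_frontier = []
        for node in frontier:
            if not node.children:
                leaves.append(node)
            else:
                ranked = sorted(node.children,
                                key=lambda c: cosine(query, c.embedding),
                                reverse=True)
                next_frontier.extend(ranked[:beam])
        frontier = next_frontier
    return [leaf.text for leaf in leaves]

# Tiny example hierarchy (2-D embeddings are illustrative toys).
tea = Node([1.0, 0.05], children=[
    Node([1.0, 0.1], text="user likes green tea"),
    Node([0.9, 0.0], text="user drinks tea every morning"),
])
work = Node([0.0, 1.0], children=[
    Node([0.05, 1.0], text="user started a new job in May"),
])
root = Node([0.5, 0.5], children=[tea, work])

# A tea-related query descends into the tea branch only, so unrelated
# (and would-be redundant) branches never enter the context.
print(retrieve_top_down(root, [1.0, 0.0], beam=1))
```

The point of the structure-driven descent is that pruning happens at the component level (whole branches are skipped), rather than post hoc over retrieved spans, which is where the abstract says temporally linked prerequisites get lost.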

Related Articles

Llms

Is the Mirage Effect a bug, or is it Geometric Reconstruction in action? A framework for why VLMs perform better "hallucinating" than guessing, and what that may tell us about what's really inside these models

Last week, a team from Stanford and UCSF (Asadi, O'Sullivan, Fei-Fei Li, Euan Ashley et al.) dropped two companion papers. The first, MAR...

Reddit - Artificial Intelligence · 1 min ·
Nlp

The Galaxy S26’s photo app can sloppify your memories | The Verge

Samsung’s S26 series offers some new AI photo editing capabilities to transform your photos. But where’s the line between acceptable edit...

The Verge - AI · 8 min ·
Llms

[D] The problem with comparing AI memory system benchmarks — different evaluation methods make scores meaningless

I've been reviewing how various AI memory systems evaluate their performance and noticed a fundamental issue with cross-system comparison...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] I had an idea, would love your thoughts

What happens if, while training an AI during pre-training, we make it such that when it shows "misaligned behaviour" we just reduce, like ...

Reddit - Machine Learning · 1 min ·
