[2602.15513] Improving MLLMs in Embodied Exploration and Question Answering with Human-Inspired Memory Modeling

arXiv - AI · 3 min read

Summary

This paper presents a non-parametric memory framework, inspired by human memory, that improves Multimodal Large Language Models (MLLMs) at embodied exploration and question answering.

Why It Matters

The research addresses significant challenges in deploying MLLMs for embodied agents, particularly in dynamic environments. By improving memory modeling, it enhances the efficiency and reasoning capabilities of AI systems, which is crucial for advancing robotics and AI applications.

Key Takeaways

  • Introduces a non-parametric memory framework that separates episodic and semantic memory.
  • Demonstrates state-of-the-art performance improvements in embodied question answering benchmarks.
  • Highlights the importance of episodic memory for exploration efficiency and semantic memory for complex reasoning.
  • Offers a retrieval-first, reasoning-assisted approach that enhances memory reuse.
  • Provides insights into cross-environment generalization capabilities of embodied agents.
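The "retrieval-first" takeaway above can be made concrete with a minimal sketch. This is not the paper's implementation: the embedding vectors, `EpisodicMemory` class, and method names are hypothetical stand-ins, and the visual-reasoning verification step is omitted. It only illustrates the core idea of recalling episodic entries by semantic similarity rather than geometric alignment.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class EpisodicMemory:
    """Hypothetical store of (embedding, observation) episodes."""

    def __init__(self):
        self.entries = []

    def store(self, embedding, observation):
        self.entries.append((embedding, observation))

    def recall(self, query_embedding, k=2):
        # Retrieval-first: rank stored episodes by semantic similarity
        # to the query; a verification pass over the top-k would follow.
        ranked = sorted(self.entries,
                        key=lambda e: cosine(e[0], query_embedding),
                        reverse=True)
        return [obs for _, obs in ranked[:k]]

mem = EpisodicMemory()
mem.store([1.0, 0.0], "red mug on the kitchen counter")
mem.store([0.0, 1.0], "blue chair in the living room")
mem.store([0.9, 0.1], "kettle next to the sink")
print(mem.recall([1.0, 0.1], k=2))
```

In the paper's framework the recalled candidates would then be verified through visual reasoning before reuse; here recall alone is shown.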

Computer Science > Robotics · arXiv:2602.15513 (cs) · Submitted on 17 Feb 2026

Title: Improving MLLMs in Embodied Exploration and Question Answering with Human-Inspired Memory Modeling
Authors: Ji Li, Jing Xia, Mingyi Li, Shiyan Hu

Abstract: Deploying Multimodal Large Language Models as the brain of embodied agents remains challenging, particularly under long-horizon observations and limited context budgets. Existing memory-assisted methods often rely on textual summaries, which discard rich visual and spatial details and remain brittle in non-stationary environments. In this work, we propose a non-parametric memory framework that explicitly disentangles episodic and semantic memory for embodied exploration and question answering. Our retrieval-first, reasoning-assisted paradigm recalls episodic experiences via semantic similarity and verifies them through visual reasoning, enabling robust reuse of past observations without rigid geometric alignment. In parallel, we introduce a program-style rule extraction mechanism that converts experiences into structured, reusable semantic memory, facilitating cross-environment generalization. Extensive experiments demonstrate state-of-the-art performance on embodied question answering and exploration benchmarks, yielding a 7.3% gain in LLM-Match and an 1...
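The abstract's "program-style rule extraction mechanism" converts concrete experiences into structured semantic memory that transfers across environments. A hedged sketch of that idea follows; the `Rule` fields, the `extract_rule` helper, and the experience schema are all illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    condition: str  # abstracted situation the rule applies to
    action: str     # reusable exploration procedure

def extract_rule(experience: dict) -> Rule:
    # Abstract away instance-specific details (exact object, room id)
    # so the extracted rule can generalize to unseen environments.
    return Rule(
        condition=f"looking for a {experience['category']}",
        action=f"search {experience['location_type']} first",
    )

exp = {"category": "mug", "object": "red mug",
       "location_type": "kitchen surfaces", "room": "kitchen-3"}
rule = extract_rule(exp)
print(rule.condition, "->", rule.action)
```

The design choice sketched here is that semantic memory stores category-level condition/action pairs rather than raw episodes, which is what makes cross-environment reuse possible.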

Related Articles

LLMs

Have Companies Begun Adopting Claude Co-Work at an Enterprise Level?

Hi Guys, My company is considering purchasing the Claude Enterprise plan. The main two constraints are: - Being able to block usage of Cl...

Reddit - Artificial Intelligence · 1 min ·
LLMs

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min ·
LLMs

[D] The problem with comparing AI memory system benchmarks — different evaluation methods make scores meaningless

I've been reviewing how various AI memory systems evaluate their performance and noticed a fundamental issue with cross-system comparison...

Reddit - Machine Learning · 1 min ·
LLMs

Shifting to AI model customization is an architectural imperative | MIT Technology Review

In the early days of large language models (LLMs), we grew accustomed to massive 10x jumps in reasoning and coding capability with every ...

MIT Technology Review · 6 min ·

