[2603.04639] RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies
Computer Science > Robotics
arXiv:2603.04639 (cs)
[Submitted on 4 Mar 2026]

Title: RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies
Authors: Yinpei Dai, Hongze Fu, Jayjun Lee, Yuejiang Liu, Haoran Zhang, Jianing Yang, Chelsea Finn, Nima Fazeli, Joyce Chai

Abstract: Memory is critical for long-horizon and history-dependent robotic manipulation. Such tasks often involve counting repeated actions or manipulating objects that become temporarily occluded. Recent vision-language-action (VLA) models have begun to incorporate memory mechanisms; however, their evaluations remain confined to narrow, non-standardized settings, which limits systematic understanding, comparison, and measurement of progress. To address these challenges, we introduce RoboMME: a large-scale standardized benchmark for evaluating and advancing VLA models in long-horizon, history-dependent scenarios. Our benchmark comprises 16 manipulation tasks constructed under a carefully designed taxonomy that evaluates temporal, spatial, object, and procedural memory. We further develop a suite of 14 memory-augmented VLA variants built on the π0.5 backbone to systematically explore different memory representations across multiple integration strategies. Experimental results show that the effectiveness of memory repre...
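The abstract describes memory-augmented VLA variants that pair a memory representation with an integration strategy. As a purely illustrative sketch (all names here are hypothetical and do not reflect the paper's actual implementation), one of the simplest such designs is a sliding-window temporal memory wrapped around a policy backbone:

```python
from collections import deque

class MemoryAugmentedPolicy:
    """Illustrative wrapper: conditions action prediction on a sliding
    window of past observations (one simple temporal-memory representation)."""

    def __init__(self, backbone, horizon=8):
        self.backbone = backbone             # e.g. a VLA model; a stand-in here
        self.memory = deque(maxlen=horizon)  # bounded temporal memory buffer

    def act(self, observation, instruction):
        self.memory.append(observation)
        # Integration strategy: pass the full history window to the backbone
        # alongside the language instruction.
        history = list(self.memory)
        return self.backbone(history, instruction)

# Toy backbone that counts repeats of the latest observation, mimicking a
# history-dependent task like "count repeated actions" from the abstract.
policy = MemoryAugmentedPolicy(
    lambda hist, instr: hist.count(hist[-1]), horizon=4
)
for obs in ["pick", "pick", "place"]:
    count = policy.act(obs, "count repeats")
print(count)  # 1 — "place" appears once in the current window
```

A memoryless policy would see only the current observation and could not solve such counting tasks, which is the failure mode the benchmark is designed to expose.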