[2603.04639] RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies
Computer Science > Robotics
arXiv:2603.04639 (cs)
[Submitted on 4 Mar 2026]

Title: RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies
Authors: Yinpei Dai, Hongze Fu, Jayjun Lee, Yuejiang Liu, Haoran Zhang, Jianing Yang, Chelsea Finn, Nima Fazeli, Joyce Chai

Abstract: Memory is critical for long-horizon and history-dependent robotic manipulation. Such tasks often involve counting repeated actions or manipulating objects that become temporarily occluded. Recent vision-language-action (VLA) models have begun to incorporate memory mechanisms; however, their evaluations remain confined to narrow, non-standardized settings, which limits systematic understanding, comparison, and measurement of progress. To address these challenges, we introduce RoboMME: a large-scale standardized benchmark for evaluating and advancing VLA models in long-horizon, history-dependent scenarios. Our benchmark comprises 16 manipulation tasks constructed under a carefully designed taxonomy that evaluates temporal, spatial, object, and procedural memory. We further develop a suite of 14 memory-augmented VLA variants built on the π0.5 backbone to systematically explore different memory representations across multiple integration strategies. Experimental results show that the effectiveness of memory repre...
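The abstract describes memory-augmented VLA variants that pair a memory representation with an integration strategy. As a purely illustrative sketch (all names here are hypothetical and do not reflect the paper's actual implementation), one of the simplest such designs is a sliding-window temporal memory wrapped around a policy backbone:

```python
from collections import deque

class MemoryAugmentedPolicy:
    """Illustrative wrapper: conditions action prediction on a sliding
    window of past observations (one simple temporal-memory representation)."""

    def __init__(self, backbone, horizon=8):
        self.backbone = backbone             # e.g. a VLA model; a stand-in here
        self.memory = deque(maxlen=horizon)  # bounded temporal memory buffer

    def act(self, observation, instruction):
        self.memory.append(observation)
        # Integration strategy: pass the full history window to the backbone
        # alongside the language instruction.
        history = list(self.memory)
        return self.backbone(history, instruction)

# Toy backbone that counts repeats of the latest observation, mimicking a
# history-dependent task like "count repeated actions" from the abstract.
policy = MemoryAugmentedPolicy(
    lambda hist, instr: hist.count(hist[-1]), horizon=4
)
for obs in ["pick", "pick", "place"]:
    count = policy.act(obs, "count repeats")
print(count)  # 1 — "place" appears once in the current window
```

A memoryless policy would see only the current observation and could not solve such counting tasks, which is the failure mode the benchmark is designed to expose.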