[2602.15010] BPP: Long-Context Robot Imitation Learning by Focusing on Key History Frames
Summary
The paper presents Big Picture Policies (BPP), an approach to robot imitation learning that conditions the policy on a minimal set of meaningful keyframes from the observation history. It targets a limitation of the best-performing current policies, which typically condition only on the current observation.
Why It Matters
Many robot tasks require memory of past observations; finding an item in a room, for example, requires remembering which places have already been searched. This research tackles the challenge of conditioning robot policies on that history without latching onto incidental features of the training data. By improving success rates on such tasks, it could lead to more effective and adaptable robotic systems in real-world applications.
Key Takeaways
- BPP improves robot imitation learning by focusing on keyframes from historical data.
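- Naively conditioning on past observations often fails because policies latch onto spurious correlations in training histories, a problem the authors trace to limited coverage of the space of possible histories.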
- BPP achieves a 70% higher success rate in real-world tasks compared to existing methods.
- The approach reduces distribution shifts between training and deployment.
- Keyframes are detected using a vision-language model to enhance task relevance.
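The keyframe idea in the takeaways above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `select_policy_context`, the `is_keyframe` predicate (a stand-in for the vision-language model detector), and the `max_keyframes` cap are all hypothetical names introduced here for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

Obs = TypeVar("Obs")  # any observation type (image, dict, feature vector, ...)


def select_policy_context(
    history: Sequence[Obs],
    current: Obs,
    is_keyframe: Callable[[Obs], bool],
    max_keyframes: int = 4,
) -> List[Obs]:
    """Build the policy input from a few key history frames plus the current frame.

    `is_keyframe` is a hypothetical stand-in for a keyframe detector; the
    paper uses a learned model for this step, which is not reproduced here.
    """
    # Keep only the history frames flagged as meaningful, in temporal order.
    keyframes = [obs for obs in history if is_keyframe(obs)]
    # Retain the most recent ones (guarding the max_keyframes == 0 case,
    # since a slice of [-0:] would return the whole list).
    recent = keyframes[-max_keyframes:] if max_keyframes > 0 else []
    # The current observation is always the last element of the context.
    return recent + [current]


# Toy usage: observations are dicts, and a frame counts as a keyframe
# when it carries a (hypothetical) event label.
history = [
    {"t": 0, "event": None},
    {"t": 1, "event": "opened_drawer"},
    {"t": 2, "event": None},
    {"t": 3, "event": "closed_drawer"},
]
context = select_policy_context(
    history, {"t": 4, "event": None}, lambda o: o["event"] is not None
)
print([o["t"] for o in context])
```

The point of the design is that the context length depends on the number of meaningful events rather than on the raw horizon, so the policy sees far fewer distinct history patterns than it would with the full observation sequence.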
arXiv:2602.15010 (cs), Computer Science > Robotics. Submitted on 16 Feb 2026.
Authors: Max Sobol Mark, Jacky Liang, Maria Attarian, Chuyuan Fu, Debidatta Dwibedi, Dhruv Shah, Aviral Kumar
Abstract
Many robot tasks require attending to the history of past observations. For example, finding an item in a room requires remembering which places have already been searched. However, the best-performing robot policies typically condition only on the current observation, limiting their applicability to such tasks. Naively conditioning on past observations often fails due to spurious correlations: policies latch onto incidental features of training histories that do not generalize to out-of-distribution trajectories upon deployment. We analyze why policies latch onto these spurious correlations and find that this problem stems from limited coverage over the space of possible histories during training, which grows exponentially with horizon. Existing regularization techniques provide inconsistent benefits across tasks, as they do not fundamentally address this coverage problem. Motivated by these findings, we propose Big Picture Policies (BPP), an approach that conditions on a minimal set of meaningful keyframes detected by a visio...
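The coverage argument in the abstract can be made concrete with a toy calculation. This is an illustrative simplification, not an analysis from the paper: it assumes each timestep has a fixed number of distinguishable outcomes (`outcomes_per_step`, a hypothetical parameter), so the number of possible histories of length `horizon` is `outcomes_per_step ** horizon`, and a dataset of `num_demos` demonstrations can cover at most that many of them.

```python
def history_coverage(num_demos: int, outcomes_per_step: int, horizon: int) -> float:
    """Upper bound on the fraction of distinct histories seen during training.

    Toy model (an assumption for illustration): every timestep has
    `outcomes_per_step` distinguishable outcomes, so there are
    outcomes_per_step ** horizon possible histories of length `horizon`.
    Each demonstration covers at most one of them.
    """
    total_histories = outcomes_per_step ** horizon
    return min(1.0, num_demos / total_histories)


# With a fixed dataset size, the coverage bound collapses as the horizon grows.
for horizon in (1, 5, 10, 20):
    bound = history_coverage(num_demos=1000, outcomes_per_step=3, horizon=horizon)
    print(f"horizon={horizon:2d}  coverage <= {bound:.3g}")
```

Under these assumptions, 1000 demonstrations can cover every history up to a horizon of 5 but less than 2% of them at a horizon of 10, which is the regime where a policy has room to latch onto incidental features of the few histories it has seen.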