[2601.09605] Sim2real Image Translation Enables Viewpoint-Robust Policies from Fixed-Camera Datasets

Summary

The paper presents MANGO, an unpaired sim2real image translation method that improves the viewpoint robustness of robot manipulation policies trained on fixed-camera datasets. The authors report that it outperforms existing methods.

Why It Matters

As robotics increasingly relies on vision-based policies, addressing the challenges posed by varying camera viewpoints is crucial. MANGO's approach allows for effective training with limited real-world data, improving the adaptability of robotic systems in diverse environments. This research contributes to advancing robotics and AI by enabling more reliable and versatile manipulation capabilities.

Key Takeaways

  • MANGO employs a segmentation-conditioned InfoNCE loss to enhance image translation.
  • The method is reported to improve success rates in real-world manipulation tasks by over 40 percentage points.
  • Training MANGO requires only a small amount of fixed-camera real-world data, yet it can generate diverse unseen viewpoints by translating simulated observations.
  • The approach addresses the sim2real challenge effectively, bridging the gap between simulated and real-world data.
  • This research highlights the importance of viewpoint consistency in training robust robotic policies.
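
For readers unfamiliar with the InfoNCE loss named in the first takeaway, the following is a minimal sketch of the generic contrastive form only. It is not the paper's segmentation-conditioned variant, whose details are not given here. The function names, the cosine-similarity scoring, the temperature of 0.07, and the toy vectors are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE: cross-entropy of selecting the positive among all candidates.

    This is the textbook form, NOT the segmentation-conditioned loss from the paper.
    """
    # Index 0 holds the positive pair; the rest are negatives.
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


# Toy 2-D features (hypothetical): the loss is small when the positive
# matches the query and larger when it does not.
query = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_aligned = info_nce(query, [1.0, 0.0], negatives)
loss_misaligned = info_nce(query, [0.0, 1.0], negatives)
```

In image translation methods that use patch-wise contrastive losses, the query and positive are typically features of corresponding patches in the input and translated images, so a low loss encourages the translation to preserve content at each location. How MANGO selects positives and negatives using segmentation is described in the paper itself.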

Computer Science > Computer Vision and Pattern Recognition

arXiv:2601.09605 (cs)
[Submitted on 14 Jan 2026 (v1), last revised 13 Feb 2026 (this version, v3)]

Title: Sim2real Image Translation Enables Viewpoint-Robust Policies from Fixed-Camera Datasets

Authors: Jeremiah Coholich, Justin Wit, Robert Azarcon, Zsolt Kira

Abstract: Vision-based policies for robot manipulation have achieved significant recent success, but are still brittle to distribution shifts such as camera viewpoint variations. Robot demonstration data is scarce and often lacks appropriate variation in camera viewpoints. Simulation offers a way to collect robot demonstrations at scale with comprehensive coverage of different viewpoints, but presents a visual sim2real challenge. To bridge this gap, we propose MANGO -- an unpaired image translation method with a novel segmentation-conditioned InfoNCE loss, a highly-regularized discriminator design, and a modified PatchNCE loss. We find that these elements are crucial for maintaining viewpoint consistency during sim2real translation. When training MANGO, we only require a small amount of fixed-camera data from the real world, but show that our method can generate diverse unseen viewpoints by translating simulated observations. In this setting, MANGO outperforms all other image...
