[2602.21203] Squint: Fast Visual Reinforcement Learning for Sim-to-Real Robotics

arXiv - Machine Learning, February 25, 2026

Summary

The paper presents Squint, a visual Soft Actor Critic method for sim-to-real robotics that trains faster in wall-clock time than prior visual off-policy and on-policy reinforcement learning methods.

Why It Matters

Visual reinforcement learning is attractive for robotics but expensive: off-policy methods are sample-efficient yet slow, while on-policy methods parallelize well but waste samples. Squint targets this trade-off by cutting wall-clock training time for image-based policies, which makes it more practical to train manipulation policies in simulation and deploy them on real robots.

Key Takeaways

Squint is a visual Soft Actor Critic method designed for fast wall-clock training.
It trains faster in wall-clock time than prior visual off-policy and on-policy methods.
Policies trained in simulation transfer to a real SO-101 robot.
Policies are trained for 15 minutes on a single RTX 3090 GPU, with tasks converging in under 6 minutes.
It combines parallel simulation, a distributional critic, resolution squinting, layer normalization, a tuned update-to-data ratio, and an optimized implementation.
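
To make the storage overhead of image observations concrete, here is a minimal back-of-the-envelope sketch. It assumes that "resolution squinting" involves working with lower-resolution images, which this summary does not spell out; the buffer size and resolutions are hypothetical illustration values, not figures from the paper, and `replay_image_bytes` is a helper made up for this example.

```python
def replay_image_bytes(num_transitions: int, height: int, width: int,
                       channels: int = 3, bytes_per_value: int = 1,
                       store_next_obs: bool = True) -> int:
    """Bytes needed to hold the image observations of a replay buffer.

    Assumes uncompressed images (uint8 by default) and, optionally, a
    separate copy of the next observation for every transition.
    """
    frames_per_transition = 2 if store_next_obs else 1
    bytes_per_frame = height * width * channels * bytes_per_value
    return num_transitions * frames_per_transition * bytes_per_frame


def gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes."""
    return num_bytes / 2**30


if __name__ == "__main__":
    buffer_size = 1_000_000  # hypothetical buffer size, not from the paper
    for side in (128, 64, 32, 16):  # hypothetical square resolutions
        size = replay_image_bytes(buffer_size, side, side)
        print(f"{side}x{side}: {gib(size):.2f} GiB")
```

Halving the image side length cuts storage by a factor of four, which is why input resolution matters so much for an off-policy method that keeps a large replay buffer on a single GPU.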

Computer Science > Robotics

arXiv:2602.21203 (cs) [Submitted on 24 Feb 2026]

Title: Squint: Fast Visual Reinforcement Learning for Sim-to-Real Robotics

Authors: Abdulaziz Almuzairee, Henrik I. Christensen

Abstract: Visual reinforcement learning is appealing for robotics but expensive -- off-policy methods are sample-efficient yet slow; on-policy methods parallelize well but waste samples. Recent work has shown that off-policy methods can train faster than on-policy methods in wall-clock time for state-based control. Extending this to vision remains challenging, where high-dimensional input images complicate training dynamics and introduce substantial storage and encoding overhead. To address these challenges, we introduce Squint, a visual Soft Actor Critic method that achieves faster wall-clock training than prior visual off-policy and on-policy methods. Squint achieves this via parallel simulation, a distributional critic, resolution squinting, layer normalization, a tuned update-to-data ratio, and an optimized implementation. We evaluate on the SO-101 Task Set, a new suite of eight manipulation tasks in ManiSkill3 with heavy domain randomization, and demonstrate sim-to-real transfer to a real SO-101 robot. We train policies for 15 minutes on a single RTX 3090 GPU, with most tasks converging in unde...
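
The interaction between parallel simulation and the update-to-data (UTD) ratio mentioned in the abstract can be sketched as follows. This is an illustration only: it assumes the UTD ratio is counted as gradient updates per collected transition (conventions vary between codebases), the environment counts and ratios are hypothetical, and `updates_for_batch` is a helper made up for this example rather than part of Squint.

```python
def updates_for_batch(num_envs: int, steps_per_env: int, utd_ratio: float) -> int:
    """Gradient updates to run after one batch of parallel rollouts.

    Assumes UTD = gradient updates / collected transitions. Other
    conventions (e.g. per vectorized step) exist and give different counts.
    """
    if num_envs <= 0 or steps_per_env <= 0 or utd_ratio <= 0:
        raise ValueError("all arguments must be positive")
    transitions = num_envs * steps_per_env
    # Always run at least one update so learning does not stall at tiny ratios.
    return max(1, int(transitions * utd_ratio))


if __name__ == "__main__":
    # Hypothetical settings: many parallel environments, one step each.
    for utd in (1.0, 0.25, 0.0625):
        n = updates_for_batch(num_envs=1024, steps_per_env=1, utd_ratio=utd)
        print(f"UTD {utd}: {n} gradient updates per vectorized step")
```

With many parallel environments, each simulation step yields a large batch of transitions, so the UTD ratio directly controls how much GPU time goes to gradient updates versus data collection. That is one reason it would need tuning for wall-clock speed.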
