[2508.01423] 3DRot: Rediscovering the Missing Primitive for RGB-Based 3D Augmentation
Summary
The paper introduces 3DRot, a plug-and-play RGB augmentation that rotates and mirrors images about the camera's optical center while updating camera intrinsics, object poses, and 3D annotations in step, so projective geometry is preserved without any scene depth.
Why It Matters
RGB-based 3D tasks such as 3D detection, depth estimation, and 3D keypoint estimation depend on scarce, expensive annotations, and many image transforms, including rotations and warps, break geometric consistency between the image and its 3D labels. 3DRot fills this gap with a rotation and reflection augmentation that keeps images and annotations consistent, which is relevant to fields like robotics and computer vision.
Key Takeaways
- 3DRot allows for geometry-consistent rotations and reflections in RGB images.
- The technique improves 3D detection and depth estimation metrics; on SUN RGB-D, for example, it raises $IoU_{3D}$ of a frozen DINO-X + Cube R-CNN pipeline above its 43.21 baseline.
- 3DRot is compatible with existing 3D augmentation methods, enhancing their effectiveness.
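The reason no scene depth is needed can be illustrated with a standard fact from projective geometry: rotating (or mirroring) the camera frame by an orthogonal matrix R about the optical center moves pixels by the homography K R K^{-1}, regardless of how far away each point is. The sketch below is a minimal numerical check of that fact, not the authors' implementation. The intrinsics `K`, the test points, and all function names are hypothetical, and it keeps `K` fixed for simplicity, whereas the paper describes updating the intrinsics as part of the augmentation.

```python
import numpy as np

def rotation_homography(K, R):
    """Pixel homography induced by applying the orthogonal matrix R
    (rotation or reflection) to the camera frame about its optical center."""
    return K @ R @ np.linalg.inv(K)

def project(K, X):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates."""
    x = (K @ X.T).T
    return x[:, :2] / x[:, 2:3]

def apply_homography(H, uv):
    """Apply a 3x3 homography to Nx2 pixel coordinates."""
    uv1 = np.hstack([uv, np.ones((len(uv), 1))])
    w = (H @ uv1.T).T
    return w[:, :2] / w[:, 2:3]

# Hypothetical pinhole intrinsics (focal length 500 px, principal point 320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# A 10-degree rotation about the camera's y axis.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])

# Hypothetical 3D annotation points at different depths (camera frame, metres).
X = np.array([[0.5, -0.2, 4.0],
              [-1.0, 0.3, 6.0],
              [0.2, 0.1, 3.0]])

H = rotation_homography(K, R)

# Path 1: warp the original pixel locations with the homography (image side).
uv_warped = apply_homography(H, project(K, X))

# Path 2: rotate the 3D annotations, then project (label side).
uv_reproj = project(K, (R @ X.T).T)

# Both paths agree even though the points lie at different depths.
assert np.allclose(uv_warped, uv_reproj)
```

The same helper covers reflections: passing `np.diag([-1.0, 1.0, 1.0])` as `R` yields a homography that mirrors pixels about the principal point's column, which is a horizontal flip when the principal point sits at the image center. To warp an actual image rather than individual pixel coordinates, `H` would be handed to an image-warping routine.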
Computer Science > Computer Vision and Pattern Recognition
arXiv:2508.01423 (cs)
[Submitted on 2 Aug 2025 (v1), last revised 16 Feb 2026 (this version, v3)]
Title: 3DRot: Rediscovering the Missing Primitive for RGB-Based 3D Augmentation
Authors: Shitian Yang, Deyu Li, Xiaoke Jiang, Lei Zhang
Abstract: RGB-based 3D tasks, e.g., 3D detection, depth estimation, 3D keypoint estimation, still suffer from scarce, expensive annotations and a thin augmentation toolbox, since many image transforms, including rotations and warps, disrupt geometric consistency. While horizontal flipping and color jitter are standard, rigorous 3D rotation augmentation has surprisingly remained absent from RGB-based pipelines, largely due to the misconception that it requires scene depth or scene reconstruction. In this paper, we introduce 3DRot, a plug-and-play augmentation that rotates and mirrors images about the camera's optical center while synchronously updating RGB images, camera intrinsics, object poses, and 3D annotations to preserve projective geometry, achieving geometry-consistent rotations and reflections without relying on any scene depth. We first validate 3DRot on a classical RGB-based 3D task, monocular 3D detection. On SUN RGB-D, inserting 3DRot into a frozen DINO-X + Cube R-CNN pipeline raises $IoU_{3D}$ from 43.21 to 4...