[2508.07112] AugLift: Depth-Aware Input Reparameterization Improves Domain Generalization in 2D-to-3D Pose Lifting
About this article
Abstract page for arXiv paper 2508.07112: AugLift: Depth-Aware Input Reparameterization Improves Domain Generalization in 2D-to-3D Pose Lifting
Computer Science > Computer Vision and Pattern Recognition arXiv:2508.07112 (cs) [Submitted on 9 Aug 2025 (v1), last revised 7 Apr 2026 (this version, v4)] Title:AugLift: Depth-Aware Input Reparameterization Improves Domain Generalization in 2D-to-3D Pose Lifting Authors:Nikolai Warner, Wenjin Zhang, Hamid Badiozamani, Irfan Essa, Apaar Sadhwani View a PDF of the paper titled AugLift: Depth-Aware Input Reparameterization Improves Domain Generalization in 2D-to-3D Pose Lifting, by Nikolai Warner and 4 other authors View PDF HTML (experimental) Abstract:Lifting-based 3D human pose estimation infers 3D joints from 2D keypoints but generalizes poorly because $(x,y)$ coordinates alone are an ill-posed, sparse representation that discards geometric information modern foundation models can recover. We propose \emph{AugLift}, which changes the representation format of lifting from 2D coordinates to a 6D geometric descriptor via two modules: (1) an \emph{Uncertainty-Aware Depth Descriptor} (UADD) -- a compact tuple $(c, d, d_{\min}, d_{\max})$ extracted from a confidence-scaled neighborhood of an off-the-shelf monocular depth map -- and (2) a scale normalization component that handles train/test distance shifts. AugLift requires no new sensors, no new data collection, and no architectural changes beyond widening the input layer; because it operates at the representation level, it is composable with any lifting architecture or domain generalization technique. In the detection settin...