[2603.17470] VirPro: Visual-referred Probabilistic Prompt Learning for Weakly-Supervised Monocular 3D Detection
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.17470 (cs)
[Submitted on 18 Mar 2026 (v1), last revised 20 Mar 2026 (this version, v2)]
Title: VirPro: Visual-referred Probabilistic Prompt Learning for Weakly-Supervised Monocular 3D Detection
Authors: Chupeng Liu, Jiyong Rao, Shangquan Sun, Runkai Zhao, Weidong Cai
Abstract: Monocular 3D object detection typically relies on pseudo-labeling techniques to reduce dependency on real-world annotations. Recent advances demonstrate that deterministic linguistic cues can serve as effective auxiliary weak supervision signals, providing complementary semantic context. However, hand-crafted textual descriptions struggle to capture the inherent visual diversity of individuals across scenes, limiting the model's ability to learn scene-aware representations. To address this challenge, we propose Visual-referred Probabilistic Prompt Learning (VirPro), an adaptive multi-modal pretraining paradigm that can be seamlessly integrated into diverse weakly supervised monocular 3D detection frameworks. Specifically, we generate a diverse set of learnable, instance-conditioned prompts across scenes and store them in an Adaptive Prompt Bank (APB). Subsequently, we introduce Multi-Gaussian Prompt Modeling (MGPM), which incorporates scene-ba...