[2504.13647] An Efficient LiDAR-Camera Fusion Network for Multi-Class 3D Dynamic Object Detection and Trajectory Prediction
Summary
The paper presents a novel LiDAR-camera fusion framework for real-time 3D dynamic object detection and trajectory prediction, enhancing service robots' capabilities in complex environments.
Why It Matters
This research addresses the critical need for efficient perception systems in robotics, particularly for mobile service robots operating in dynamic settings. By integrating LiDAR and camera data, the proposed method improves detection accuracy and trajectory prediction, which are essential for safe navigation and interaction with moving objects.
Key Takeaways
- Introduces a multi-modal framework combining LiDAR and camera inputs for enhanced 3D object detection.
- Achieves significant improvements in detection accuracy and trajectory prediction metrics over existing methods.
- Demonstrates real-time performance on entry-level hardware, making it practical for deployment in service robots.
Computer Science > Robotics
arXiv:2504.13647 (cs)
[Submitted on 18 Apr 2025 (v1), last revised 24 Feb 2026 (this version, v2)]
Title: An Efficient LiDAR-Camera Fusion Network for Multi-Class 3D Dynamic Object Detection and Trajectory Prediction
Authors: Yushen He, Lei Zhao, Tianchen Deng, Zipeng Fang, Weidong Chen
Abstract: Service mobile robots are often required to avoid dynamic objects while performing their tasks, but they usually have only limited computational resources. To further advance the practical application of service robots in complex dynamic environments, we propose an efficient multi-modal framework for 3D object detection and trajectory prediction, which synergistically integrates LiDAR and camera inputs to achieve real-time perception of pedestrians, vehicles, and riders in 3D space. The framework incorporates two novel models: 1) a Unified modality detector with Mamba and Transformer (UniMT) for object detection, which achieves high-accuracy object detection with fast inference speed, and 2) a Reference Trajectory-based Multi-Class Transformer (RTMCT) for efficient and diverse trajectory prediction of multi-class objects with flexible-length trajectories. Evaluations on the CODa benchmark demonstrate that our method outperforms existing ones in both de...
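The paper's own fusion architecture (UniMT) is not detailed in this summary, so the sketch below illustrates only the generic preprocessing step that LiDAR-camera fusion methods commonly rely on: projecting each LiDAR point into the camera image and attaching the image feature at that pixel to the point (often called "point painting"). All function names, the extrinsic matrix `T_cam_lidar`, and the intrinsic matrix `K` are illustrative assumptions, not the authors' API.

```python
import numpy as np

def project_points(points_lidar, T_cam_lidar, K):
    """Project LiDAR points (N, 3) into the camera image plane.

    T_cam_lidar: (4, 4) extrinsic transform from LiDAR frame to camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns pixel coordinates (N, 2) and a mask of points in front of the camera.
    """
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])   # homogeneous coords (N, 4)
    cam = (T_cam_lidar @ homo.T).T[:, :3]               # points in camera frame (N, 3)
    in_front = cam[:, 2] > 0.1                          # discard points behind the lens
    uv = (K @ cam.T).T                                  # pinhole projection (N, 3)
    uv = uv[:, :2] / uv[:, 2:3]                         # perspective divide -> pixels
    return uv, in_front

def paint_points(points_lidar, image_feats, T_cam_lidar, K):
    """Append to each point the camera feature at its projected pixel."""
    h, w, c = image_feats.shape
    uv, valid = project_points(points_lidar, T_cam_lidar, K)
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    feats = image_feats[v, u] * valid[:, None]          # zero features for invalid points
    return np.hstack([points_lidar, feats])             # painted cloud (N, 3 + C)
```

The painted point cloud can then be consumed by any LiDAR-based 3D detector, which is one common way camera semantics reach a point-cloud backbone; whether UniMT fuses at this stage or deeper in the network is not specified in this summary.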