[2602.20958] EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations
Summary
This paper presents a system that uses an Extended Kalman Filter (EKF) to fuse depth camera measurements with deep learning-based monocular distance estimation, so that a UAV can estimate its distance to a person and follow them safely during search and rescue operations.
Why It Matters
In search and rescue (SAR) operations, a UAV that follows a person must estimate its distance to that person accurately and in real time in order to keep a safe separation. This work fuses depth camera measurements with deep learning-based distance estimates to make that estimate more robust under real-world conditions.
Key Takeaways
- Fuses depth camera measurements with monocular camera-to-body distance estimation for more robust tracking and following.
- Uses YOLO-pose for real-time human detection and distance measurement.
- Demonstrates a reduction in distance estimation errors by up to 15.3% in tested scenarios.
- Validates the system against motion capture ground truth data for accuracy.
- Supports UAV person-following in SAR missions, where the UAV must maintain a safe distance from the target.
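The fusion idea in the takeaways above can be illustrated with a small Extended Kalman Filter. This is only a sketch under stated assumptions, not the authors' implementation: the state vector, the constant-velocity motion model, the pinhole measurement model for the monocular branch, and every numeric value (`focal_px`, `body_height_m`, the noise levels) are hypothetical choices made for illustration. The depth camera is treated as a direct, linear measurement of distance, while the monocular branch is treated as a pixel height that varies as `focal_px * body_height_m / distance`, which is the nonlinear part that calls for an EKF rather than a plain Kalman filter.

```python
import numpy as np


class DistanceEKF:
    """Sketch of an EKF over x = [distance (m), range rate (m/s)].

    All parameters are illustrative assumptions, not values from the paper.
    """

    def __init__(self, d0, focal_px=600.0, body_height_m=1.7):
        self.x = np.array([d0, 0.0])        # initial distance, zero range rate
        self.P = np.eye(2)                  # initial state covariance
        self.focal_px = focal_px            # hypothetical focal length (pixels)
        self.body_height_m = body_height_m  # hypothetical person height (m)

    def predict(self, dt, accel_std=0.5):
        """Constant-velocity prediction with white-acceleration process noise."""
        F = np.array([[1.0, dt], [0.0, 1.0]])
        G = np.array([0.5 * dt**2, dt])
        Q = np.outer(G, G) * accel_std**2
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q

    def _update(self, z, z_pred, H, r_var):
        """Scalar EKF update given the predicted measurement and Jacobian row H."""
        H = H.reshape(1, 2)
        S = H @ self.P @ H.T + r_var        # innovation variance, shape (1, 1)
        K = self.P @ H.T / S                # Kalman gain, shape (2, 1)
        self.x = self.x + (K * (z - z_pred)).ravel()
        self.P = (np.eye(2) - K @ H) @ self.P

    def update_depth(self, depth_m, std=0.05):
        """Depth camera: measures distance directly (linear model)."""
        self._update(depth_m, self.x[0], np.array([1.0, 0.0]), std**2)

    def update_mono(self, height_px, std=5.0):
        """Monocular branch: person height in pixels, h = f * H_body / d."""
        d = max(self.x[0], 1e-3)            # guard against division by zero
        k = self.focal_px * self.body_height_m
        jac = np.array([-k / d**2, 0.0])    # dh/dd evaluated at the current estimate
        self._update(height_px, k / d, jac, std**2)


if __name__ == "__main__":
    ekf = DistanceEKF(d0=3.0)
    true_d = 4.0
    for _ in range(50):
        ekf.predict(dt=0.05)
        ekf.update_depth(true_d)                    # skip this call on depth dropout
        ekf.update_mono(600.0 * 1.7 / true_d)       # noise-free pixel height
    print(f"fused distance estimate: {ekf.x[0]:.3f} m")
```

Because each sensor has its own update method, a frame with an invalid depth reading can simply skip `update_depth` and rely on the monocular update alone, which is one plausible reason to fuse the two modalities.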
Computer Science > Robotics
arXiv:2602.20958 (cs)
[Submitted on 24 Feb 2026]
Title: EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations
Authors: Luka Šiktar, Branimir Ćaran, Bojan Šekoranja, Marko Švaco
Abstract: Search and rescue (SAR) operations require rapid responses to save lives or property. Unmanned Aerial Vehicles (UAVs) equipped with vision-based systems support these missions through prior terrain investigation or real-time assistance during the mission itself. Vision-based UAV frameworks aid human search tasks by detecting and recognizing specific individuals, then tracking and following them while maintaining a safe distance. A key safety requirement for UAV following is the accurate estimation of the distance between camera and target object under real-world conditions, achieved by fusing multiple image modalities. UAVs with deep learning-based vision systems offer a new approach to the planning and execution of SAR operations. As part of the system for automatic people detection and face recognition using deep learning, in this paper we present the fusion of depth camera measurements and monocular camera-to-body distance estimation for robust tracking and following. Deep learning-based fil...