[2502.20326] Deep Reinforcement Learning based Autonomous Decision-Making for Cooperative UAVs: A Search and Rescue Real World Application
Summary
This paper presents an end-to-end framework for autonomous decision-making by cooperating UAVs in search-and-rescue operations. It combines guidance, navigation, and centralised task allocation, and is demonstrated in GNSS-denied indoor environments.
Why It Matters
The research addresses critical challenges in search-and-rescue missions, particularly in indoor environments where GNSS positioning is unavailable. By combining deep reinforcement learning for control with a graph attention network for task allocation, this work aims to improve the operational efficiency and safety of UAVs, potentially improving emergency response outcomes.
Key Takeaways
- Introduces an end-to-end framework for UAVs in search-and-rescue.
- Uses a Twin Delayed Deep Deterministic Policy Gradient (TD3) controller, trained with an Artificial Potential Field reward, for smoother and safer trajectories than distance-only baselines.
- Employs a deep Graph Attention Network for efficient task allocation among drones.
- Achieves centimeter-level altitude stability by fusing depth-camera altimetry with IMU vertical velocity in a lightweight complementary filter.
- Demonstrates successful real-world application with first-place results in a competitive environment.
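As a rough illustration of the Artificial Potential Field (APF) reward that the abstract describes for the TD3 controller, the sketch below scores a drone position as the negative of an attractive-plus-repulsive potential. The function name `apf_reward`, the gains `k_att` and `k_rep`, the `influence` radius, and the quadratic potential forms are all assumptions taken from the textbook potential-field formulation. The paper's actual reward terms are not given in this summary.

```python
import math


def apf_reward(pos, goal, obstacles, k_att=1.0, k_rep=1.0, influence=1.0):
    """Return a reward equal to the negative total potential at `pos`.

    Hypothetical sketch: `pos`, `goal`, and each entry of `obstacles` are
    (x, y, z) tuples in metres. The gains and the influence radius are
    illustrative values, not parameters reported by the paper.
    """
    # Attractive term: grows quadratically with distance to the goal.
    d_goal = math.dist(pos, goal)
    u_att = 0.5 * k_att * d_goal ** 2

    # Repulsive term: only obstacles inside the influence radius contribute,
    # and the penalty grows sharply as the drone gets closer to them.
    u_rep = 0.0
    for obstacle in obstacles:
        d_obs = max(math.dist(pos, obstacle), 1e-6)  # avoid division by zero
        if d_obs < influence:
            u_rep += 0.5 * k_rep * (1.0 / d_obs - 1.0 / influence) ** 2

    # Lower potential is better, so the reward is its negative.
    return -(u_att + u_rep)
```

Under these assumptions, moving toward the goal raises the reward continuously, while entering an obstacle's influence radius lowers it. That is the kind of dense shaping signal a distance-only reward lacks.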
Computer Science > Robotics
arXiv:2502.20326 (cs)
Submitted on 27 Feb 2025 (v1), last revised 16 Feb 2026 (this version, v2)
Title: Deep Reinforcement Learning based Autonomous Decision-Making for Cooperative UAVs: A Search and Rescue Real World Application
Authors: Thomas Hickling, Maxwell Hogan, Abdulla Tammam, Nabil Aouf
Abstract: This paper presents the first end-to-end framework that combines guidance, navigation, and centralised task allocation for multiple UAVs performing autonomous search-and-rescue (SAR) in GNSS-denied indoor environments. A Twin Delayed Deep Deterministic Policy Gradient controller is trained with an Artificial Potential Field (APF) reward that blends attractive and repulsive potentials with continuous control, accelerating convergence and yielding smoother, safer trajectories than distance-only baselines. Collaborative mission assignment is solved by a deep Graph Attention Network that, at each decision step, reasons over the drone-task graph to produce near-optimal allocations with negligible on-board compute. To arrest the notorious Z-drift of indoor LiDAR-SLAM, we fuse depth-camera altimetry with IMU vertical velocity in a lightweight complementary filter, giving centimetre-level altitude stability without external beacon...
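The altitude fusion described in the abstract can be sketched as a first-order complementary filter. This is a minimal illustration, assuming the depth camera gives an absolute but noisy altitude and the IMU gives a smooth vertical velocity. The class name `AltitudeComplementaryFilter`, the blend factor `alpha`, and the update rule are assumptions for illustration, not the authors' implementation.

```python
class AltitudeComplementaryFilter:
    """Blend integrated IMU vertical velocity with depth-camera altitude.

    Hypothetical sketch: `alpha` near 1.0 trusts the IMU prediction over
    short time scales, while the remaining (1 - alpha) weight pulls the
    estimate toward the depth-camera reading so integration drift cannot
    accumulate.
    """

    def __init__(self, alpha=0.98, initial_altitude=0.0):
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        self.alpha = alpha
        self.altitude = initial_altitude

    def update(self, depth_altitude, imu_vertical_velocity, dt):
        """Advance the estimate by one time step of `dt` seconds."""
        # Predict: integrate the IMU vertical velocity over the time step.
        predicted = self.altitude + imu_vertical_velocity * dt
        # Correct: mix the prediction with the absolute depth measurement.
        self.altitude = (
            self.alpha * predicted + (1.0 - self.alpha) * depth_altitude
        )
        return self.altitude
```

A caller would construct one filter per drone and call `update` at the sensor rate. The choice of `alpha` trades responsiveness to the depth camera against sensitivity to its noise, and would need tuning on real data.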