[2603.18853] Learn for Variation: Variationally Guided AAV Trajectory Learning in Differentiable Environments
Electrical Engineering and Systems Science > Systems and Control
arXiv:2603.18853 (eess)
[Submitted on 19 Mar 2026 (v1), last revised 25 Mar 2026 (this version, v2)]

Title: Learn for Variation: Variationally Guided AAV Trajectory Learning in Differentiable Environments
Authors: Xiucheng Wang, Zhenye Chen, Nan Cheng

Abstract: Autonomous aerial vehicles (AAVs) empower sixth-generation (6G) Internet-of-Things (IoT) networks through mobility-driven data collection. However, conventional reward-driven reinforcement learning for AAV trajectory planning suffers from severe credit-assignment issues and training instability, because sparse scalar rewards fail to capture the long-term, nonlinear effects of sequential movements. To address these challenges, this paper proposes Learn for Variation (L4V), a gradient-informed trajectory learning framework that replaces high-variance scalar reward signals with dense, analytically grounded policy gradients. Specifically, the coupled evolution of AAV kinematics, distance-dependent channel gains, and per-user data-collection progress is first unrolled into an end-to-end differentiable computational graph. Backpropagation through time then serves as a discrete adjoint solver, which propagates exact sensitivities from the cumulative mission objecti...
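The mechanism the abstract describes (unrolling AAV kinematics, distance-dependent channel gains, and data-collection progress into a differentiable graph, then running backpropagation through time as a discrete adjoint solver) can be sketched in miniature. The sketch below is an illustrative assumption, not the paper's actual formulation: it uses a 1-D position `p`, a single user, a hypothetical gain model `g(p) = 1 / (1 + (p - user)^2)`, and hand-written adjoint recursions in place of an autodiff framework.

```python
def channel_gain(p, user):
    # Hypothetical distance-dependent channel gain (illustrative, not from the paper).
    return 1.0 / (1.0 + (p - user) ** 2)

def channel_gain_grad(p, user):
    # Analytic derivative of the gain with respect to position.
    d = p - user
    return -2.0 * d / (1.0 + d * d) ** 2

def rollout(u, p0, user, dt):
    """Forward unroll: p_{t+1} = p_t + u_t * dt, accumulating collected data J."""
    ps = [p0]
    J = 0.0
    for ut in u:
        J += channel_gain(ps[-1], user) * dt  # per-step data collection
        ps.append(ps[-1] + ut * dt)           # kinematic update
    return ps, J

def bptt_grad(u, p0, user, dt):
    """Backward pass = discrete adjoint solver.

    lam_t = dJ/dp_t obeys lam_t = lam_{t+1} + g'(p_t) * dt, and the exact
    control sensitivity is dJ/du_t = lam_{t+1} * dt (since dp_{t+1}/du_t = dt).
    """
    ps, _ = rollout(u, p0, user, dt)
    grad = [0.0] * len(u)
    lam = 0.0  # adjoint at the horizon: no terminal cost
    for t in reversed(range(len(u))):
        grad[t] = lam * dt
        lam = lam + channel_gain_grad(ps[t], user) * dt
    return grad

if __name__ == "__main__":
    # Dense gradient ascent on the controls, replacing a scalar-reward signal.
    u = [0.0] * 20
    for _ in range(100):
        g = bptt_grad(u, 0.0, 2.0, 0.1)
        u = [ui + 0.5 * gi for ui, gi in zip(u, g)]
    _, J = rollout(u, 0.0, 2.0, 0.1)
    print(f"collected data after ascent: {J:.4f}")
```

Because every time step contributes an exact sensitivity to every earlier control, each `grad[t]` is dense and noise-free, which is the contrast the abstract draws with sparse, high-variance scalar rewards.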