[2508.02046] NaviMaster: Learning a Unified Policy for GUI and Embodied Navigation Tasks
Computer Science > Robotics
arXiv:2508.02046 (cs)
[Submitted on 4 Aug 2025 (v1), last revised 25 Mar 2026 (this version, v3)]

Title: NaviMaster: Learning a Unified Policy for GUI and Embodied Navigation Tasks
Authors: Zhihao Luo, Wentao Yan, Jingyu Gong, Min Wang, Zhizhong Zhang, Xuhong Wang, Yuan Xie, Xin Tan

Abstract: Recent advances in Graphical User Interface (GUI) navigation and embodied navigation have driven progress in both areas, yet these domains have largely evolved in isolation, with disparate datasets and training paradigms. In this paper, we observe that both tasks can be formulated as Markov Decision Processes (MDPs), suggesting a foundational principle for their unification. Hence, we present NaviMaster, the first agent to unify GUI navigation and embodied navigation within a single framework. Specifically, NaviMaster (i) proposes a visual-target trajectory collection pipeline that generates trajectories for both GUI and embodied tasks using a single formulation; (ii) employs a unified reinforcement learning framework on the mixed data to improve generalization; and (iii) designs a novel distance-aware reward to ensure efficient learning from the trajectories. Through extensive experiments on out-of-domain benchmarks, Navi...
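To make the idea of a distance-aware reward in item (iii) more concrete, here is a minimal sketch. The abstract does not state the formula, so the exponential decay, the function name `distance_aware_reward`, and the `scale` parameter are illustrative assumptions, not the paper's definition. The sketch only shows the general principle: a predicted point near the target earns more reward than a distant one, instead of a binary hit-or-miss signal.

```python
import math


def distance_aware_reward(pred, target, scale):
    """Hypothetical reward in (0, 1] that decays with Euclidean distance.

    pred, target: coordinate tuples of equal length (e.g. pixel positions).
    scale: assumed positive decay length; larger values are more forgiving.
    This is an illustrative form, not the reward defined in the paper.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    dist = math.dist(pred, target)  # Euclidean distance (Python 3.8+)
    return math.exp(-dist / scale)


# An exact hit gets the maximum reward; nearer guesses beat farther ones.
exact = distance_aware_reward((100, 200), (100, 200), scale=50.0)
near = distance_aware_reward((110, 200), (100, 200), scale=50.0)
far = distance_aware_reward((300, 200), (100, 200), scale=50.0)
```

A graded signal of this kind gives a policy usable feedback even when it misses the target, which is one plausible reading of why such a reward could make learning more efficient than a sparse success indicator.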