[2509.26324] COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models
Computer Science > Robotics
arXiv:2509.26324 (cs)
[Submitted on 30 Sep 2025 (v1), last revised 1 Mar 2026 (this version, v3)]

Title: COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models
Authors: Ruiyang Wang, Hao-Lun Hsu, David Hunt, Jiwoo Kim, Shaocheng Luo, Miroslav Pajic

Abstract: Autonomous exploration and object search in unknown indoor environments remain challenging for multi-robot systems (MRS). Traditional approaches often rely on greedy frontier assignment strategies with limited inter-robot coordination. In this work, we present Coordinated Multi-Robot Exploration and Search using Vision Language Models (COMRES-VLM), a novel framework that leverages Vision Language Models (VLMs) for intelligent coordination of MRS tasked with efficient exploration and target object search. COMRES-VLM integrates real-time frontier cluster extraction and topological skeleton analysis with VLM reasoning over shared occupancy maps, robot states, and optional natural language priors to generate globally consistent waypoint assignments. Extensive experiments in large-scale simulated indoor environments with up to six robots demonstrate that COMRES-VLM consistently outperforms state-of-the-art coordination methods, including the Capacitated Vehicle Routing Problem (CVRP)...
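To make the abstract's pipeline concrete, the sketch below illustrates the conventional frontier-based step that precedes the VLM reasoning: extracting frontier cells from a shared occupancy grid and grouping them into clusters whose centroids become candidate waypoints. This is a minimal illustrative reconstruction, not the authors' implementation; the grid encoding (-1 unknown, 0 free, 1 occupied) and all function names are assumptions.

    # Frontier-cluster extraction from a shared occupancy grid (illustrative
    # sketch; encoding and names are assumptions, not the paper's code).
    import numpy as np
    from collections import deque

    UNKNOWN, FREE, OCCUPIED = -1, 0, 1

    def frontier_cells(grid: np.ndarray) -> set[tuple[int, int]]:
        """Frontier cells: free cells with at least one unknown 4-neighbor."""
        rows, cols = grid.shape
        frontiers = set()
        for r in range(rows):
            for c in range(cols):
                if grid[r, c] != FREE:
                    continue
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                        frontiers.add((r, c))
                        break
        return frontiers

    def cluster_frontiers(frontiers: set[tuple[int, int]]) -> list[list[tuple[int, int]]]:
        """Group frontier cells into clusters via BFS over 8-connectivity."""
        clusters, seen = [], set()
        for cell in frontiers:
            if cell in seen:
                continue
            queue, cluster = deque([cell]), []
            seen.add(cell)
            while queue:
                r, c = queue.popleft()
                cluster.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nb = (r + dr, c + dc)
                        if nb in frontiers and nb not in seen:
                            seen.add(nb)
                            queue.append(nb)
            clusters.append(cluster)
        return clusters

    def cluster_centroids(clusters: list[list[tuple[int, int]]]) -> list[tuple[float, float]]:
        """Centroid of each cluster, usable as a candidate waypoint."""
        return [tuple(np.mean(cl, axis=0)) for cl in clusters]

In a COMRES-VLM-style pipeline, these cluster centroids, together with the topological skeleton and robot states described in the abstract, would be serialized into the VLM prompt that produces the globally consistent waypoint assignments; only that downstream assignment step is VLM-specific.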