[2510.16518] DIV-Nav: Open-Vocabulary Spatial Relationships for Multi-Object Navigation

Computer Science > Robotics

arXiv:2510.16518 (cs)

[Submitted on 18 Oct 2025 (v1), last revised 30 Mar 2026 (this version, v2)]

Title: DIV-Nav: Open-Vocabulary Spatial Relationships for Multi-Object Navigation

Authors: Jesús Ortega-Peimbert, Finn Lukas Busch, Timon Homberger, Quantao Yang, Olov Andersson

Abstract: Advances in open-vocabulary semantic mapping and object navigation have enabled robots to perform an informed search of their environment for an arbitrary object. However, such zero-shot object navigation is typically designed for simple queries with an object name like "television" or "blue rug". Here, we consider more complex free-text queries with spatial relationships, such as "find the remote on the table", while still leveraging the robustness of a semantic map. We present DIV-Nav, a real-time navigation system that efficiently addresses this problem through a series of relaxations: i) Decomposing natural language instructions with complex spatial constraints into simpler object-level queries on a semantic map, ii) computing the Intersection of individual semantic belief maps to identify regions where all objects co-exist, and iii) Validating the discovered objects against the original, complex spatial constraints via an LVLM. We further investigate how to adapt the frontier exploration o...
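
To illustrate step ii) of the abstract, here is a minimal sketch of intersecting per-object belief maps. It is not the authors' implementation: the abstract does not say which operator DIV-Nav uses, so the cell-wise minimum (a fuzzy AND) is an assumption, as are the grid representation, the function names `intersect_belief_maps` and `best_cell`, and the example values.

```python
from typing import List, Tuple

Grid = List[List[float]]  # assumed representation: 2D grid of per-cell beliefs in [0, 1]


def intersect_belief_maps(maps: List[Grid]) -> Grid:
    """Combine same-shaped belief grids with a cell-wise minimum (fuzzy AND).

    The minimum is an assumed stand-in for the paper's intersection operator.
    """
    if not maps:
        raise ValueError("need at least one belief map")
    rows, cols = len(maps[0]), len(maps[0][0])
    for m in maps:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all belief maps must share one shape")
    return [[min(m[r][c] for m in maps) for c in range(cols)] for r in range(rows)]


def best_cell(grid: Grid) -> Tuple[int, int]:
    """Return the (row, col) index of the highest joint belief."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Hypothetical beliefs for the decomposed query "find the remote on the table".
remote = [[0.1, 0.7, 0.2],
          [0.0, 0.6, 0.9]]
table = [[0.2, 0.8, 0.1],
         [0.1, 0.9, 0.3]]

joint = intersect_belief_maps([remote, table])
goal = best_cell(joint)  # cell where both objects are jointly most plausible
```

In this toy example the cell with the strongest "remote" belief (0.9) is not selected, because the "table" belief there is low; the joint map favors the cell where both objects are plausible. In a pipeline like the one the abstract describes, such a candidate region would still need to be validated against the original spatial constraint.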

Originally published on March 31, 2026. Curated by AI News.
