[2603.20336] GEM: A Native Graph-based Index for Multi-Vector Retrieval
Computer Science > Information Retrieval
arXiv:2603.20336 (cs)
[Submitted on 20 Mar 2026]

Title: GEM: A Native Graph-based Index for Multi-Vector Retrieval
Authors: Yao Tian, Zhoujin Tian, Xi Zhao, Ruiyuan Zhang, Xiaofang Zhou

Abstract: In multi-vector retrieval, both queries and data are represented as sets of high-dimensional vectors, enabling finer-grained semantic matching and improving retrieval quality over single-vector approaches. However, its practical adoption is held back by the lack of effective indexing algorithms. Existing work, attempting to reuse standard single-vector indexes, often fails to preserve multi-vector semantics or remains slow. In this work, we present GEM, a native indexing framework for multi-vector representations. The core idea is to construct a proximity graph directly over vector sets, preserving their fine-grained semantics while enabling efficient navigation. First, GEM designs a set-level clustering scheme. It associates each vector set with only its most informative clusters, effectively reducing redundancy without hurting semantic coverage. Then, it builds local proximity graphs within clusters and bridges them into a globally navigable structure. To handle the non-metric nature of multi-vector similarity, GEM decouples the graph construction metric from the final relevance score and inj...
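
To make the "non-metric" remark concrete, here is a minimal sketch, assuming the relevance score is the MaxSim (sum-of-maxima) late-interaction score popularized by ColBERT. The abstract does not name the scoring function GEM uses, so this is an illustrative assumption, and the helper names `dot` and `maxsim` as well as the toy vectors are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query: List[Vector], doc: List[Vector]) -> float:
    """Sum, over query vectors, of the best inner product with any doc vector.

    Assumption: a MaxSim-style score; not confirmed as GEM's scoring function.
    """
    return sum(max(dot(q, d) for d in doc) for q in query)


# Toy 2-D vector sets (hypothetical data, for illustration only).
A = [(1.0, 0.0)]
B = [(1.0, 0.0), (0.5, 0.5)]

# Swapping the roles of query and document changes the score,
# so the score is not symmetric and cannot be a metric.
score_ab = maxsim(A, B)  # A's single vector is matched against B
score_ba = maxsim(B, A)  # each of B's two vectors is matched against A
print(score_ab, score_ba)
```

Because a score like this is asymmetric, graph construction methods that assume a proper distance cannot use it directly, which is one plausible reading of why the abstract separates the graph construction metric from the final relevance score.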