[2601.22783] Compact Hypercube Embeddings for Fast Text-based Wildlife Observation Retrieval
Computer Science > Information Retrieval
arXiv:2601.22783 (cs)
[Submitted on 30 Jan 2026 (v1), last revised 5 Apr 2026 (this version, v2)]

Title: Compact Hypercube Embeddings for Fast Text-based Wildlife Observation Retrieval

Authors: Ilyass Moummad, Marius Miron, David Robinson, Kawtar Zaher, Hervé Goëau, Olivier Pietquin, Pierre Bonnet, Emmanuel Chemla, Matthieu Geist, Alexis Joly

Abstract: Large-scale biodiversity monitoring platforms increasingly rely on multimodal wildlife observations. While recent foundation models enable rich semantic representations across vision, audio, and language, retrieving relevant observations from massive archives remains challenging due to the computational cost of high-dimensional similarity search. In this work, we introduce compact hypercube embeddings for fast text-based wildlife observation retrieval, a framework that enables efficient text-based search over large-scale wildlife image and audio databases using compact binary representations. Building on the cross-view code alignment hashing framework, we extend lightweight hashing beyond a single-modality setup to align natural language descriptions with visual or acoustic observations in a shared Hamming space. Our approach leverages pretrained wildlife foundation models, including BioCLIP and BioLingual, and ...
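
The abstract describes search over compact binary codes in a shared Hamming space. The following is a minimal sketch of that general idea only, not the paper's method: it assumes codes are obtained by taking the sign of each embedding dimension (an illustrative choice), and it uses random vectors as stand-ins for the outputs of a text encoder and an image or audio encoder. The helper names `binarize`, `hamming`, and `search` are hypothetical.

```python
import random

def binarize(vec):
    """Pack the signs of a real-valued vector into one integer (1 bit per dimension)."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Count the bits that differ between two codes (XOR, then popcount)."""
    return bin(a ^ b).count("1")

def search(query_code, db_codes, k=3):
    """Return the indices of the k database codes nearest to the query in Hamming distance."""
    order = sorted(range(len(db_codes)), key=lambda i: hamming(query_code, db_codes[i]))
    return order[:k]

# Stand-in data: random 64-dimensional "observation embeddings" (assumption, not real model output).
rng = random.Random(0)
dim = 64
db_vecs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(1000)]
db_codes = [binarize(v) for v in db_vecs]

# Stand-in "text query": a slightly perturbed copy of observation 42.
query_vec = [x + 0.1 * rng.gauss(0, 1) for x in db_vecs[42]]
top = search(binarize(query_vec), db_codes)
print(top)
```

Each 64-dimensional vector is reduced to a single 64-bit code, and comparing two codes costs one XOR and one bit count, which is the source of the speed and storage savings that binary retrieval aims for. The linear scan in `search` is kept for clarity; a practical system would likely use a dedicated binary index.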