[2510.27584] Image Hashing via Cross-View Code Alignment in the Age of Foundation Models
Computer Science > Computer Vision and Pattern Recognition
arXiv:2510.27584 (cs)
[Submitted on 31 Oct 2025 (v1), last revised 4 Apr 2026 (this version, v3)]
Title: Image Hashing via Cross-View Code Alignment in the Age of Foundation Models
Authors: Ilyass Moummad, Kawtar Zaher, Hervé Goëau, Alexis Joly

Abstract: Efficient large-scale retrieval requires representations that are both compact and discriminative. Foundation models provide powerful visual and multimodal embeddings, but nearest neighbor search in these high-dimensional spaces is computationally expensive. Hashing offers an efficient alternative by enabling fast Hamming distance search with binary codes, yet existing approaches often rely on complex pipelines, multi-term objectives, designs specialized for a single learning paradigm, and long training times. We introduce CroVCA (Cross-View Code Alignment), a simple and unified principle for learning binary codes that remain consistent across semantically aligned views. A single binary cross-entropy loss enforces alignment, while coding-rate maximization serves as an anti-collapse regularizer to promote balanced and diverse codes. To implement this, we design HashCoder, a lightweight MLP hashing network with a final batch normalization layer to enforce balanced c...
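
The abstract motivates hashing by the speed of Hamming distance search over binary codes. The sketch below illustrates that retrieval step only; it is not the CroVCA objective or the HashCoder network. It assumes codes are obtained by thresholding real-valued outputs at zero, and the names `binarize`, `hamming_search`, and `POPCOUNT` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

# Lookup table: number of set bits for every possible byte value (0..255).
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Threshold real-valued outputs at zero (an assumption of this sketch)
    and pack the resulting bits into bytes, 8 bits per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming_search(query_code: np.ndarray, db_codes: np.ndarray, k: int):
    """Return indices and Hamming distances of the k database codes
    closest to the query code."""
    # XOR marks the differing bits; the lookup table counts them per byte.
    xor = np.bitwise_xor(db_codes, query_code)
    dists = POPCOUNT[xor].sum(axis=1, dtype=np.int64)
    # Stable sort keeps ties in database order.
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]


# Usage: 1000 random 64-dimensional vectors stand in for model outputs.
rng = np.random.default_rng(0)
database = binarize(rng.standard_normal((1000, 64)))  # shape (1000, 8) bytes
neighbors, distances = hamming_search(database[42], database, k=5)
```

Packing 64 bits into 8 bytes means each comparison touches 8 bytes instead of 64 floats, which is the source of the efficiency the abstract refers to. A production system would more likely use hardware popcount instructions or a dedicated binary index than a lookup table.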