[2602.17386] Visual Model Checking: Graph-Based Inference of Visual Routines for Image Retrieval
Summary
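The paper proposes a framework that integrates formal verification into deep learning-based image retrieval, combining graph-based verification with neural code generation. It targets queries that current embedding-based models handle unreliably: those involving complex relationships, object compositions, or precise constraints such as identities, counts and proportions.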
Why It Matters
The work targets the reliability and transparency of image retrieval systems, which underpin applications ranging from search engines to AI-driven visual assistants. Because results are grounded in formal reasoning, each retrieved image can be checked against the constraints stated in the query rather than accepted on embedding similarity alone.
Key Takeaways
- Introduces a framework combining formal verification and deep learning for image retrieval.
- Addresses challenges in handling complex queries involving relationships and constraints.
- Enhances transparency by verifying query constraints against retrieved content.
- Aims to improve the reliability of results from popular embedding-based models.
- Supports open-vocabulary natural language queries for broader applicability.
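To make the idea of verifying query constraints against retrieved content concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `SceneGraph` structure, the helpers `count_label` and `has_relation`, and the hand-written `routine` (standing in for code that a neural generator might emit) are all hypothetical names introduced here. The sketch assumes each candidate image already has a graph of detected objects and relations, and it filters candidates by checking the query "exactly two dogs, and a person holding a leash" against that graph.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical per-image graph: detected objects plus labelled relations."""
    objects: dict[int, str]  # object id -> label
    # Each relation is a (subject id, predicate, object id) triple.
    relations: set[tuple[int, str, int]] = field(default_factory=set)


def count_label(graph: SceneGraph, label: str) -> int:
    """Count the objects in the graph that carry the given label."""
    return sum(1 for found in graph.objects.values() if found == label)


def has_relation(graph: SceneGraph, subj: str, predicate: str, obj: str) -> bool:
    """Check whether any relation links a `subj` object to an `obj` object."""
    return any(
        p == predicate and graph.objects[s] == subj and graph.objects[o] == obj
        for s, p, o in graph.relations
    )


def routine(graph: SceneGraph) -> bool:
    """Hand-written stand-in for a generated routine (assumption).

    Query: "exactly two dogs, and a person holding a leash".
    """
    return count_label(graph, "dog") == 2 and has_relation(
        graph, "person", "holding", "leash"
    )


def verify(candidates, check):
    """Keep only the candidate image ids whose graph satisfies the check."""
    return [image_id for image_id, graph in candidates if check(graph)]


# Toy candidates, as if returned by an embedding-based retriever (made-up data).
CANDIDATES = [
    ("img_a", SceneGraph({0: "dog", 1: "dog", 2: "person", 3: "leash"},
                         {(2, "holding", 3)})),
    ("img_b", SceneGraph({0: "dog", 1: "person", 2: "leash"},
                         {(1, "holding", 2)})),   # only one dog
    ("img_c", SceneGraph({0: "dog", 1: "dog", 2: "person"})),  # no leash
]

print(verify(CANDIDATES, routine))  # prints ['img_a']
```

A similarity score alone cannot separate one dog from two, whereas an explicit check either holds or fails for each image, which gives every accepted result a verifiable reason.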
Computer Science > Artificial Intelligence
arXiv:2602.17386 (cs)
[Submitted on 19 Feb 2026]
Title: Visual Model Checking: Graph-Based Inference of Visual Routines for Image Retrieval
Authors: Adrià Molina, Oriol Ramos Terrades, Josep Lladós
Abstract: Information retrieval lies at the foundation of the modern digital industry. While natural language search has seen dramatic progress in recent years largely driven by embedding-based models and large-scale pretraining, the field still faces significant challenges. Specifically, queries that involve complex relationships, object compositions, or precise constraints such as identities, counts and proportions often remain unresolved or unreliable within current frameworks. In this paper, we propose a novel framework that integrates formal verification into deep learning-based image retrieval through a synergistic combination of graph-based verification methods and neural code generation. Our approach aims to support open-vocabulary natural language queries while producing results that are both trustworthy and verifiable. By grounding retrieval results in a system of formal reasoning, we move beyond the ambiguity and approximation that often characterize vector representations. Instead of accepting uncertainty as a given, our fra...