[2601.12781] VIRO: Robust and Efficient Neuro-Symbolic Reasoning with Verification for Referring Expression Comprehension
Computer Science > Artificial Intelligence
arXiv:2601.12781 (cs)
[Submitted on 19 Jan 2026 (v1), last revised 20 Mar 2026 (this version, v2)]
Title: VIRO: Robust and Efficient Neuro-Symbolic Reasoning with Verification for Referring Expression Comprehension
Authors: Hyejin Park, Junhyuk Kwon, Suha Kwak, Jungseul Ok
Abstract: Referring Expression Comprehension (REC) aims to localize the image region corresponding to a natural language query. Recent neuro-symbolic REC approaches leverage large language models (LLMs) and vision-language models (VLMs) to perform compositional reasoning, decomposing queries into structured programs and executing them step-by-step. While such approaches achieve interpretable reasoning and strong zero-shot generalization, they assume that intermediate reasoning steps are accurate. However, this assumption causes cascading errors: false detections and invalid relations propagate through the reasoning chain, yielding high-confidence false positives even when no target is present in the image. To address this limitation, we introduce Verification-Integrated Reasoning Operators (VIRO), a neuro-symbolic framework that embeds lightweight operator-level verifiers within reasoning steps. Each operator executes and validates its output, such as object exist...
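
The operator-level verification idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Detection`, `Operator`, `find`, `left_of`, and `run_program` names, the toy scene, and the 0.5 score threshold are all hypothetical, and the hand-written checks stand in for whatever verifiers the paper actually uses. The sketch only shows the control flow in which each step executes, validates its own output, and the program abstains (returns `None`) instead of passing an unverified result down the chain.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    label: str
    box: Box
    score: float  # hypothetical stand-in for a detector confidence


@dataclass
class Operator:
    """One reasoning step paired with its own verifier (hypothetical structure)."""
    name: str
    execute: Callable[[List[Detection]], List[Detection]]
    verify: Callable[[List[Detection]], bool]


def center_x(d: Detection) -> float:
    return (d.box[0] + d.box[2]) / 2.0


def find(label: str, min_score: float = 0.5) -> Operator:
    """Keep candidates with the given label; verify that one is confident enough."""
    def execute(cands: List[Detection]) -> List[Detection]:
        return [d for d in cands if d.label == label]

    def verify(out: List[Detection]) -> bool:
        return bool(out) and max(d.score for d in out) >= min_score

    return Operator(f"find({label})", execute, verify)


def left_of(anchor_label: str, scene: List[Detection]) -> Operator:
    """Keep candidates left of some anchor object; verify the relation holds at all."""
    anchors = [d for d in scene if d.label == anchor_label]

    def execute(cands: List[Detection]) -> List[Detection]:
        return [d for d in cands
                if any(center_x(d) < center_x(a) for a in anchors)]

    def verify(out: List[Detection]) -> bool:
        return bool(out)

    return Operator(f"left_of({anchor_label})", execute, verify)


def run_program(operators: List[Operator],
                scene: List[Detection]) -> Optional[Detection]:
    """Run operators in order; abstain as soon as any verifier rejects its step."""
    candidates = list(scene)
    for op in operators:
        candidates = op.execute(candidates)
        if not op.verify(candidates):
            return None  # abstain rather than propagate an unverified result
    return max(candidates, key=lambda d: d.score, default=None)


# Toy scene: a dog on the left, a cat on the right (made-up data).
scene = [
    Detection("dog", (10, 10, 50, 50), 0.9),
    Detection("cat", (100, 10, 150, 50), 0.8),
]

hit = run_program([find("dog"), left_of("cat", scene)], scene)
miss = run_program([find("zebra")], scene)
wrong_relation = run_program([find("cat"), left_of("dog", scene)], scene)
```

In this toy setup, "the dog left of the cat" resolves to the dog detection, while a query for an absent object or an unsatisfied relation yields `None` at the step where verification fails, rather than a forced prediction at the end of the chain.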