[2511.22715] ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering

About this article

Computer Science > Computer Vision and Pattern Recognition
arXiv:2511.22715 (cs)
Submitted on 27 Nov 2025 (v1), last revised 31 Mar 2026 (this version, v2)

Title: ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering

Authors: Alberto Compagnoni, Marco Morini, Sara Sarto, Federico Cocchi, Davide Caffagni, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

Abstract: Multimodal Large Language Models (MLLMs) have shown impressive capabilities in jointly understanding text, images, and videos, often evaluated via Visual Question Answering (VQA). However, even state-of-the-art MLLMs struggle with domain-specific or knowledge-intensive queries, where relevant information is underrepresented in pre-training data. Knowledge-based VQA (KB-VQA) addresses this by retrieving external documents to condition answer generation, but current retrieval-augmented approaches suffer from low precision, noisy passages, and limited reasoning. To address this, we propose ReAG, a novel Reasoning-Augmented Multimodal RAG approach that combines coarse- and fine-grained retrieval with a critic model that filters irrelevant passages, ensuring high-quality additional context. The model follows a multi-stage training strategy leveraging reinforcement learning to enhance reasoning over retrieved c...
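The retrieval flow named in the abstract (coarse retrieval, then fine-grained retrieval, then a critic that drops irrelevant passages) can be pictured with a small toy sketch. This is not the authors' implementation: the `Document` class, the function names, the token-overlap score, and the threshold are all hypothetical stand-ins chosen for illustration. The sketch is text-only, whereas the paper's setting is multimodal, and the reinforcement-learning training stage is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical container for one knowledge-base entry.
    title: str
    passages: list[str]


def overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def coarse_retrieve(query: str, docs: list[Document], k: int = 2) -> list[Document]:
    # Coarse stage (assumed): rank whole documents and keep the top k.
    def score(d: Document) -> float:
        return overlap(query, d.title + " " + " ".join(d.passages))

    return sorted(docs, key=score, reverse=True)[:k]


def fine_retrieve(query: str, docs: list[Document], k: int = 3) -> list[str]:
    # Fine stage (assumed): rank individual passages of the surviving documents.
    passages = [p for d in docs for p in d.passages]
    return sorted(passages, key=lambda p: overlap(query, p), reverse=True)[:k]


def critic_filter(query: str, passages: list[str], threshold: float = 0.5) -> list[str]:
    # Stand-in for a learned critic model: keep passages scoring above a threshold.
    return [p for p in passages if overlap(query, p) >= threshold]


def build_context(query: str, docs: list[Document]) -> list[str]:
    """Chain the three stages and return the passages passed to the generator."""
    candidates = coarse_retrieve(query, docs)
    passages = fine_retrieve(query, candidates)
    return critic_filter(query, passages)
```

In a real system each stage would presumably use learned retrievers and a trained critic instead of token overlap; the sketch only shows how the stages narrow the context step by step.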

Originally published on April 01, 2026. Curated by AI News.
