[2603.28651] Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning

[2603.28651] Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2603.28651: Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning

Computer Science > Artificial Intelligence arXiv:2603.28651 (cs) [Submitted on 27 Mar 2026] Title:Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning Authors:Rongjin Li, Zichen Tang, Xianghe Wang, Xinyi Hu, Zhengyu Wang, Zhengyu Lu, Yiling Huang, Jiayuan Chen, Weisheng Tan, Jiacheng Liu, Zhongjun Yang, Haihong E View a PDF of the paper titled Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning, by Rongjin Li and 11 other authors View PDF HTML (experimental) Abstract:With the rapid progress of multimodal large language models (MLLMs), AI already performs well at literature retrieval and certain reasoning tasks, serving as a capable assistant to human researchers, yet it remains far from autonomous research. The fundamental reason is that current work on academic paper reasoning is largely confined to a search-oriented paradigm centered on pre-specified targets, with reasoning grounded in relevance retrieval, which struggles to support researcher-style full-document understanding, reasoning, and verification. To bridge this gap, we propose \textbf{ScholScan}, a new benchmark for academic paper reasoning. ScholScan introduces a scan-oriented task setting that asks models to read and cross-check entire papers like human researchers, scanning the document to identify consistency issues. The benchmark comprises 1,800 carefully annotated questions drawn from nine error categories across 13 natural-science domains and 7...

Originally published on March 31, 2026. Curated by AI News.

Related Articles

Block Resets Management With AI As Cash App Adds Installment Transfers
Llms

Block Resets Management With AI As Cash App Adds Installment Transfers

Block (NYSE:XYZ) plans a permanent organizational overhaul that replaces many middle management roles with AI-driven models to create fla...

AI Tools & Products · 5 min ·
Anthropic leaks source code for its AI coding agent Claude
Llms

Anthropic leaks source code for its AI coding agent Claude

Anthropic accidentally exposed roughly 512,000 lines of proprietary TypeScript source code for its AI-powered coding agent Claude Code

AI Tools & Products · 3 min ·
AI Desktop 98 lets you chat with Claude, ChatGPT, and Gemini through a Windows 98-inspired interface
Llms

AI Desktop 98 lets you chat with Claude, ChatGPT, and Gemini through a Windows 98-inspired interface

It even has Minesweeper.

AI Tools & Products · 3 min ·
Llms

[R] Looking for arXiv cs.LG endorser, inference monitoring using information geometry

Hi r/MachineLearning, I’m looking for an arXiv endorser in cs.LG for a paper on inference-time distribution shift detection for deployed ...

Reddit - Machine Learning · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime