[2601.21278] GeoRC: A Benchmark for Geolocation Reasoning Chains

arXiv - Machine Learning April 21, 2026 4 min read

About this article

Abstract page for arXiv paper 2601.21278: GeoRC: A Benchmark for Geolocation Reasoning Chains

Computer Science > Computer Vision and Pattern Recognition

arXiv:2601.21278 (cs)

[Submitted on 29 Jan 2026 (v1), last revised 20 Apr 2026 (this version, v2)]

Title: GeoRC: A Benchmark for Geolocation Reasoning Chains

Authors: Mohit Talreja, Joshua Diao, Jim Thannikary James, Radu Casapu, Tejas Santanam, Ethan Mendes, Alan Ritter, Wei Xu, James Hays

Abstract: Vision Language Models (VLMs) are good at recognizing the global location of a photograph; their geolocation prediction accuracy rivals the best human experts. But many VLMs are startlingly bad at explaining which image evidence led to their prediction, even when their location prediction is correct. In this paper, we introduce GeoRC, the first benchmark for geolocation reasoning chains sourced directly from Champion-tier GeoGuessr experts, including the reigning world champion. This benchmark consists of 800 "ground truth" reasoning chains across 500 query scenes from GeoGuessr maps, with expert chains addressing hundreds of different discriminative attributes, such as soil properties, architecture, and license plate shapes. We evaluate LLM-as-a-judge and VLM-as-a-judge strategies for scoring VLM-generated reasoning chains against our expert reasoning chains and find that Qwen 3 LLM-as-a-judge correlates best with human-expert scoring. Our benchmark reveals tha...

Originally published on April 21, 2026. Curated by AI News.

