[2604.08457] CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning

Computer Science > Computer Vision and Pattern Recognition · arXiv:2604.08457 (cs)
[Submitted on 9 Apr 2026 (v1), last revised 10 Apr 2026 (this version, v2)]

Title: CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning
Authors: Rui Gan, Junyi Ma, Pei Li, Xingyou Yang, Kai Chen, Sikai Chen, Bin Ran

Abstract: Cooperative autonomous driving requires traffic scene understanding from both vehicle and infrastructure perspectives. While vision-language models (VLMs) show strong general reasoning capabilities, their performance in safety-critical traffic scenarios remains insufficiently evaluated due to the ego-vehicle focus of existing benchmarks. To bridge this gap, we present CrashSight, a large-scale vision-language benchmark for roadway crash understanding using real-world roadside camera data. The dataset comprises 250 crash videos annotated with 13K multiple-choice question-answer pairs organized under a two-tier taxonomy. Tier 1 evaluates visual grounding of scene context and involved parties, while Tier 2 probes higher-level reasoning, including crash mechanics, causal attribution, temporal progression, and post-crash outcomes. We benchmark 8 state-of-the-art VLMs and show that, despite strong...

Originally published on April 13, 2026. Curated by AI News.
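The abstract describes a two-tier taxonomy of multiple-choice questions (Tier 1 for visual grounding, Tier 2 for higher-level reasoning) scored across 8 VLMs. A minimal sketch of how such per-tier accuracy might be aggregated is shown below; the class names and fields are illustrative assumptions, not the benchmark's actual schema.

```python
# Hypothetical sketch of CrashSight-style per-tier scoring.
# MCQItem fields are assumptions for illustration, not the real dataset format.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class MCQItem:
    video_id: str
    tier: int          # 1 = visual grounding, 2 = higher-level reasoning
    category: str      # e.g. "crash mechanics", "causal attribution"
    question: str
    choices: list[str]
    answer: str        # gold choice label, e.g. "B"

def accuracy_by_tier(items: list[MCQItem], predictions: dict) -> dict:
    """Aggregate multiple-choice accuracy per taxonomy tier.

    predictions maps (video_id, question) -> predicted choice label.
    """
    correct, total = defaultdict(int), defaultdict(int)
    for item in items:
        total[item.tier] += 1
        if predictions.get((item.video_id, item.question)) == item.answer:
            correct[item.tier] += 1
    return {t: correct[t] / total[t] for t in total}
```

Splitting scores by tier, as the paper's taxonomy suggests, makes it visible when a model grounds the scene correctly (Tier 1) but fails at causal or temporal reasoning (Tier 2).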

Related Articles

From LLMs to hallucinations, here's a simple guide to common AI terms (TechCrunch - AI · 19 min)

[2604.08110] OV-Stitcher: A Global Context-Aware Framework for Training-Free Open-Vocabulary Semantic Segmentation (arXiv - AI · 3 min)

[2603.06665] Better Eyes, Better Thoughts: Why Vision Chain-of-Thought Fails in Medicine (arXiv - AI · 3 min)

[2602.04674] Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility (arXiv - AI · 4 min)