[2604.08457] CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning
Computer Science > Computer Vision and Pattern Recognition

arXiv:2604.08457 (cs)

[Submitted on 9 Apr 2026 (v1), last revised 10 Apr 2026 (this version, v2)]

Title: CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning

Authors: Rui Gan, Junyi Ma, Pei Li, Xingyou Yang, Kai Chen, Sikai Chen, Bin Ran

Abstract: Cooperative autonomous driving requires traffic scene understanding from both vehicle and infrastructure perspectives. While vision-language models (VLMs) show strong general reasoning capabilities, their performance in safety-critical traffic scenarios remains insufficiently evaluated due to the ego-vehicle focus of existing benchmarks. To bridge this gap, we present CrashSight, a large-scale vision-language benchmark for roadway crash understanding using real-world roadside camera data. The dataset comprises 250 crash videos, annotated with 13K multiple-choice question-answer pairs organized under a two-tier taxonomy. Tier 1 evaluates visual grounding of scene context and involved parties, while Tier 2 probes higher-level reasoning, including crash mechanics, causal attribution, temporal progression, and post-crash outcomes. We benchmark 8 state-of-the-art VLMs and show that, despite strong...
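
To make the benchmark's structure concrete, here is a minimal sketch of how a two-tier multiple-choice benchmark like the one the abstract describes might be represented and scored. The record schema, field names, and the per-tier accuracy metric are illustrative assumptions, not the paper's released format or official evaluation protocol.

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class QAItem:
    # Hypothetical record layout for one question-answer pair.
    video_id: str       # roadside-camera crash clip the question refers to
    tier: int           # 1 = visual grounding, 2 = higher-level reasoning
    category: str       # e.g. "crash mechanics", "causal attribution"
    question: str
    options: list[str]  # multiple-choice candidates
    answer: str         # ground-truth option label, e.g. "B"

def per_tier_accuracy(items: list[QAItem], predictions: dict) -> dict:
    """Compute accuracy separately for Tier 1 and Tier 2 questions.

    `predictions` maps a (video_id, question) key to the model's chosen
    option label; unanswered questions count as wrong.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item.tier] += 1
        if predictions.get((item.video_id, item.question)) == item.answer:
            correct[item.tier] += 1
    return {tier: correct[tier] / total[tier] for tier in total}

Reporting accuracy per tier (and, in a fuller version, per category within Tier 2) would separate a model's perceptual grounding from its causal and temporal reasoning, which is the distinction the two-tier taxonomy is designed to expose.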