[2601.01095] NarrativeTrack: Evaluating Entity-Centric Reasoning for Narrative Understanding

Computer Science > Computer Vision and Pattern Recognition
arXiv:2601.01095 (cs)
[Submitted on 3 Jan 2026 (v1), last revised 28 Mar 2026 (this version, v2)]

Title: NarrativeTrack: Evaluating Entity-Centric Reasoning for Narrative Understanding

Authors: Hyeonjeong Ha, Jinjin Ge, Bo Feng, Kaixin Ma, Gargi Chakraborty

Abstract: Multimodal large language models (MLLMs) have achieved impressive progress in vision-language reasoning, yet their ability to understand temporally unfolding narratives in videos remains underexplored. True narrative understanding requires grounding who is doing what, when, and where, maintaining coherent entity representations across dynamic visual and temporal contexts. We introduce NarrativeTrack, the first benchmark to evaluate narrative understanding in MLLMs through fine-grained entity-centric reasoning. Unlike existing benchmarks limited to short clips or coarse scene-level semantics, we decompose videos into constituent entities and examine their continuity via a Compositional Reasoning Progression (CRP), a structured evaluation framework that progressively increases narrative complexity across three dimensions: entity existence, entity changes, and entity ambiguity. CRP challenges models to advance from temporal persistence to contextual evolution and fine-grained percept...

Originally published on March 31, 2026. Curated by AI News.
