[2605.06924] A$^2$RD: Agentic Autoregressive Diffusion for Long Video Consistency



Computer Science > Computer Vision and Pattern Recognition
arXiv:2605.06924 (cs)
[Submitted on 7 May 2026]

Title: A$^2$RD: Agentic Autoregressive Diffusion for Long Video Consistency

Authors: Do Xuan Long, Yale Song, Min-Yen Kan, Tomas Pfister, Long T. Le

Abstract: Synthesizing consistent and coherent long video remains a fundamental challenge. Existing methods suffer from semantic drift and narrative collapse over long horizons. We present A$^2$RD, an Agentic Auto-Regressive Diffusion architecture that decouples creative synthesis from consistency enforcement. A$^2$RD formulates long video synthesis as a closed-loop process that synthesizes and self-improves video segment-by-segment through a Retrieve--Synthesize--Refine--Update cycle. It comprises three core components: (i) Multimodal Video Memory that tracks video progression across modalities; (ii) Adaptive Segment Generation that switches among generation modes for natural progression and visual consistency; and (iii) Hierarchical Test-Time Self-Improvement that self-improves each segment at frame and video levels to prevent error propagation. We further introduce LVBench-C, a challenging benchmark with non-linear entity and environment transitions to stress-test long-horizon consistency. Across public and LVBench-C benchmarks spanning one- to ten-minute vid...
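The Retrieve--Synthesize--Refine--Update cycle named in the abstract can be pictured as a simple control loop. The sketch below is only an illustration of that loop shape, not the authors' implementation: the `Memory` class, the `generate_long_video` function, the `synthesize`/`score`/`refine` callables, the `threshold` and `max_refine_steps` parameters, and the use of strings in place of video segments are all assumptions made here for readability.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """Hypothetical stand-in for a video memory: a list of past segments."""
    notes: List[str] = field(default_factory=list)

    def retrieve(self, k: int = 2) -> List[str]:
        # Assumed retrieval rule: return the k most recent entries.
        return self.notes[-k:]

    def update(self, segment: str) -> None:
        self.notes.append(segment)


def generate_long_video(
    prompts: List[str],
    synthesize: Callable[[str, List[str]], str],
    score: Callable[[str, List[str]], float],
    refine: Callable[[str, List[str]], str],
    threshold: float = 0.5,
    max_refine_steps: int = 2,
) -> List[str]:
    """Produce one segment per prompt with a closed-loop cycle."""
    memory = Memory()
    segments: List[str] = []
    for prompt in prompts:
        context = memory.retrieve()                # Retrieve
        segment = synthesize(prompt, context)      # Synthesize
        for _ in range(max_refine_steps):          # Refine
            if score(segment, context) >= threshold:
                break
            segment = refine(segment, context)
        memory.update(segment)                     # Update
        segments.append(segment)
    return segments


# Toy callables so the loop can be run end to end (all hypothetical).
def toy_synthesize(prompt: str, context: List[str]) -> str:
    return f"draft({prompt})"


def toy_score(segment: str, context: List[str]) -> float:
    return 1.0 if segment.startswith("refined") else 0.0


def toy_refine(segment: str, context: List[str]) -> str:
    return f"refined({segment})"


video = generate_long_video(["a", "b"], toy_synthesize, toy_score, toy_refine)
print(video)
```

In this toy setup every draft fails the first check and passes after one refinement, so each segment is refined exactly once before it is written to memory. Refining before the memory update is the point of the loop: a flawed segment is corrected before later segments can condition on it.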

Originally published on May 11, 2026. Curated by AI News.
