[2601.16296] Memory-V2V: Memory-Augmented Video-to-Video Diffusion for Consistent Multi-Turn Editing

Computer Science > Computer Vision and Pattern Recognition

arXiv:2601.16296 (cs)
[Submitted on 22 Jan 2026 (v1), last revised 23 Mar 2026 (this version, v2)]

Title: Memory-V2V: Memory-Augmented Video-to-Video Diffusion for Consistent Multi-Turn Editing

Authors: Dohun Lee, Chun-Hao Paul Huang, Xuelin Chen, Jong Chul Ye, Duygu Ceylan, Hyeonho Jeong

Abstract: Video-to-video diffusion models achieve impressive single-turn editing performance, but practical editing workflows are inherently iterative. When edits are applied sequentially, existing models treat each turn independently, often causing previously generated regions to drift or be overwritten. We identify this failure mode as the problem of cross-turn consistency in multi-turn video editing. We introduce Memory-V2V, a memory-augmented framework that treats prior edits as structured constraints for subsequent generations. Memory-V2V maintains an external memory of previous outputs, retrieves task-relevant edits, and integrates them through relevance-aware tokenization and adaptive compression. These technical ingredients enable scalable conditioning without linear growth in computation. We demonstrate Memory-V2V on iterative video novel view synthesis and text-guided long video editing. Memory-V2V substantially enhances cross-turn consistency w...
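
The abstract names three ingredients (an external memory of previous outputs, retrieval of task-relevant edits, and compression that keeps conditioning cost bounded) but gives no implementation details. The toy sketch below only illustrates that general pattern. Everything in it is an assumption made for illustration and not the paper's method: the names `MemoryEntry`, `EditMemory`, and `build_conditioning`, the use of cosine similarity for retrieval, and the use of average pooling for compression are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    turn: int                  # editing turn that produced this output
    key: list[float]           # hypothetical descriptor of the edit (e.g. a prompt or pose embedding)
    tokens: list[list[float]]  # hypothetical latent tokens of the edited output


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def average_pool(tokens, factor):
    """Replace each run of `factor` consecutive tokens with their mean."""
    pooled = []
    for start in range(0, len(tokens), factor):
        group = tokens[start:start + factor]
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return pooled


@dataclass
class EditMemory:
    entries: list = field(default_factory=list)

    def add(self, entry):
        self.entries.append(entry)

    def retrieve(self, query, top_k):
        """Return the top_k entries whose keys are most similar to the query."""
        ranked = sorted(self.entries, key=lambda e: cosine(query, e.key), reverse=True)
        return ranked[:top_k]


def build_conditioning(memory, query, top_k=2, token_budget=8):
    """Assemble conditioning tokens from retrieved entries within a fixed budget.

    Assumed scheme: the most relevant entry is pooled least, each lower-ranked
    entry is pooled twice as hard, and the base pooling factor doubles until
    the total fits the budget (or no further compression is possible).
    """
    retrieved = memory.retrieve(query, top_k)
    max_len = max((len(e.tokens) for e in retrieved), default=0)
    base = 1
    while True:
        cond = []
        for rank, entry in enumerate(retrieved):
            cond.extend(average_pool(entry.tokens, base * 2 ** rank))
        if len(cond) <= token_budget or base >= max_len:
            return cond
        base *= 2
```

In this sketch the number of conditioning tokens depends on `top_k` and `token_budget` rather than on how many turns are stored, which is one simple way to avoid conditioning cost growing linearly with the editing history. Retrieval here still scans every stored entry, so it is not a claim about how the paper achieves its scaling.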

Originally published on March 24, 2026. Curated by AI News.
