[2602.08277] PISCO: Precise Video Instance Insertion with Sparse Control

arXiv - AI 4 min read

Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.08277 (cs)
[Submitted on 9 Feb 2026 (v1), last revised 27 Mar 2026 (this version, v2)]

Title: PISCO: Precise Video Instance Insertion with Sparse Control
Authors: Xiangbo Gao, Renjie Li, Xinghao Chen, Yuheng Wu, Suofei Feng, Qing Yin, Zhengzhong Tu

Abstract: The landscape of AI video generation is undergoing a pivotal shift: moving beyond general generation, which relies on exhaustive prompt-engineering and "cherry-picking", toward fine-grained, controllable generation and high-fidelity post-processing. In professional AI-assisted filmmaking, it is crucial to perform precise, targeted modifications. A cornerstone of this transition is video instance insertion, which requires inserting a specific instance into existing footage while maintaining scene integrity. Unlike traditional video editing, this task imposes several requirements: precise spatial-temporal placement, physically consistent scene interaction, and faithful preservation of the original dynamics, all achieved with minimal user effort. In this paper, we propose PISCO, a video diffusion model for precise video instance insertion with arbitrary sparse keyframe control. PISCO allows users to specify a single keyframe, start-and-end keyframes, or sparse keyframes at arbitrary timestamps, and...
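The three control settings the abstract lists (a single keyframe, start-and-end keyframes, or sparse keyframes at arbitrary timestamps) can all be pictured as the same conditioning signal that differs only in which timesteps are constrained. The sketch below is a hypothetical illustration of that idea, not the paper's implementation; the function name, tensor shapes, and mask convention are assumptions for the sake of the example.

```python
import numpy as np

def build_sparse_condition(num_frames, frame_h, frame_w, keyframes):
    """Assemble a conditioning signal from sparse keyframes.

    keyframes: dict mapping timestep index -> (H, W, 3) float array with
    the inserted instance composited at its target location.
    Returns (cond, mask): cond holds keyframe content at the constrained
    timesteps (zeros elsewhere); mask is 1 where a keyframe pins the
    generation and 0 where the model is free to synthesize.
    """
    cond = np.zeros((num_frames, frame_h, frame_w, 3), dtype=np.float32)
    mask = np.zeros((num_frames, 1, 1, 1), dtype=np.float32)
    for t, frame in keyframes.items():
        if not 0 <= t < num_frames:
            raise ValueError(f"keyframe index {t} outside [0, {num_frames})")
        cond[t] = frame
        mask[t] = 1.0
    return cond, mask

# The three settings differ only in the keys of the dict:
kf = np.ones((32, 32, 3), dtype=np.float32)
single, _ = build_sparse_condition(16, 32, 32, {0: kf})
bracket, _ = build_sparse_condition(16, 32, 32, {0: kf, 15: kf})
cond, mask = build_sparse_condition(16, 32, 32, {0: kf, 7: kf, 15: kf})
```

A diffusion model would typically consume `cond` and `mask` as extra input channels, so the same architecture handles all three control modes without change.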

Originally published on March 30, 2026. Curated by AI News.

Related Articles

- [2511.18746] Any4D: Open-Prompt 4D Generation from Natural Language and Images (Machine Learning) · arXiv - AI · 4 min
- [2512.14549] Dual-objective Language Models: Training Efficiency Without Overfitting (LLMs) · arXiv - AI · 3 min
- [2510.21011] Generating the Modal Worker: A Cross-Model Audit of Race and Gender in LLM-Generated Personas Across 41 Occupations (LLMs) · arXiv - AI · 4 min
- [2510.24133] Compositional Image Synthesis with Inference-Time Scaling (LLMs) · arXiv - AI · 3 min