[2602.08277] PISCO: Precise Video Instance Insertion with Sparse Control

arXiv - AI March 30, 2026 4 min read

About this article

Abstract page for arXiv paper 2602.08277: PISCO: Precise Video Instance Insertion with Sparse Control

Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.08277 (cs)
[Submitted on 9 Feb 2026 (v1), last revised 27 Mar 2026 (this version, v2)]

Title: PISCO: Precise Video Instance Insertion with Sparse Control

Authors: Xiangbo Gao, Renjie Li, Xinghao Chen, Yuheng Wu, Suofei Feng, Qing Yin, Zhengzhong Tu

Abstract: The landscape of AI video generation is undergoing a pivotal shift: moving beyond general generation - which relies on exhaustive prompt-engineering and "cherry-picking" - towards fine-grained, controllable generation and high-fidelity post-processing. In professional AI-assisted filmmaking, it is crucial to perform precise, targeted modifications. A cornerstone of this transition is video instance insertion, which requires inserting a specific instance into existing footage while maintaining scene integrity. Unlike traditional video editing, this task imposes several requirements: precise spatial-temporal placement, physically consistent scene interaction, and the faithful preservation of original dynamics - all achieved with minimal user effort. In this paper, we propose PISCO, a video diffusion model for precise video instance insertion with arbitrary sparse keyframe control. PISCO allows users to specify a single keyframe, start-and-end keyframes, or sparse keyframes at arbitrary timestamps, and...

Originally published on March 30, 2026. Curated by AI News.

