[2603.28762] On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.28762 (cs)
[Submitted on 30 Mar 2026]

Authors: Omer Dahary, Benaya Koren, Daniel Garibi, Daniel Cohen-Or

Abstract: Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer from a significant lack of variety, converging on a narrow set of visual solutions for any given prompt. This typicality bias presents a challenge for creative applications that require a wide range of generative outcomes. We identify a fundamental trade-off in current approaches to diversity: modifying model inputs requires costly optimization to incorporate feedback from the generative path. In contrast, acting on spatially-committed intermediate latents tends to disrupt the forming visual structure, leading to artifacts. In this work, we propose to apply repulsion in the Contextual Space as a novel framework for achieving rich diversity in Diffusion Transformers. By intervening in the multimodal attention channels, we apply on-the-fly repulsion during the transformer's forward pass, injecting the intervention between blocks where text conditioning is enriched with emergent image structure. This allows for...
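The abstract does not say what form the repulsion takes, so the following is only a minimal sketch of the general idea of pushing a batch of samples apart in a shared feature space. It assumes an RBF-kernel repulsion with a median-heuristic bandwidth, applied to an array standing in for the per-sample context tokens passed between transformer blocks. The function name `repel_context` and the parameters `step` and `bandwidth` are hypothetical and do not come from the paper.

```python
import numpy as np


def repel_context(context, step=0.1, bandwidth=None):
    """Push each sample's context tokens away from the other samples in the batch.

    context: array of shape (batch, tokens, dim). This is a stand-in for the
             text-conditioning tokens between blocks (an assumption of this sketch).
    step, bandwidth: hypothetical hyperparameters, not taken from the paper.
    """
    batch = context.shape[0]
    if batch < 2:
        # Nothing to repel from when there is a single sample.
        return context.copy()

    flat = context.reshape(batch, -1)
    # diff[i, j] = x_i - x_j, shape (batch, batch, tokens * dim)
    diff = flat[:, None, :] - flat[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)

    if bandwidth is None:
        # Median heuristic over off-diagonal squared distances.
        off_diag = sq_dist[~np.eye(batch, dtype=bool)]
        bandwidth = np.median(off_diag) + 1e-8

    kernel = np.exp(-sq_dist / bandwidth)
    np.fill_diagonal(kernel, 0.0)

    # Descent direction on sum_j k(x_i, x_j): moves x_i away from every x_j,
    # with nearby samples weighted more strongly than distant ones.
    push = (kernel[:, :, None] * diff).sum(axis=1) * (2.0 / bandwidth)
    return (flat + step * push).reshape(context.shape)
```

In a real Diffusion Transformer such a step would have to be hooked in between blocks of the forward pass. Where and how often to apply it, and how strongly, are design choices this sketch does not attempt to reproduce.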