[2603.28114] Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.28114 (cs)
[Submitted on 30 Mar 2026]
Title: Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention
Authors: Seunghun Oh, Unsang Park
Abstract: Cross-attention is the primary interface through which text conditions latent diffusion models, yet its step-wise multi-resolution dynamics remain under-characterized, limiting principled training-free control. We cast diffusion cross-attention as a spatiotemporal signal on the latent grid by summarizing token-softmax weights into token-agnostic concentration maps and tracking their radially binned Fourier power over denoising. Across prompts and seeds, encoder cross-attention exhibits a consistent coarse-to-fine spectral progression, yielding a stable time-frequency fingerprint of token competition. Building on this structure, we introduce Attention Frequency Modulation (AFM), a plug-and-play inference-time intervention that edits token-wise pre-softmax cross-attention logits in the Fourier domain: low- and high-frequency bands are reweighted with a progress-aligned schedule and can be adaptively gated by token-allocation entropy, before the token softmax. AFM provides a continuous handle to bias the spatial scale of token-competition pattern...
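The two operations named in the abstract, radially binned Fourier power of an attention map and band-wise reweighting of pre-softmax logits before the token softmax, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the hard radial cutoff, the linear schedule, the choice to leave the DC term untouched, and the use of the per-location maximum token weight as a stand-in for a "concentration map" are all assumptions made for illustration. The entropy gating mentioned in the abstract is omitted.

```python
import numpy as np

def radial_frequency(h, w):
    """Radial frequency (cycles per pixel) of every 2D FFT bin; 0 at DC."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2)

def radial_power(map2d, n_bins=8):
    """Sum the Fourier power of one 2D map inside equal-width radial bins."""
    h, w = map2d.shape
    power = np.abs(np.fft.fft2(map2d)) ** 2
    r = radial_frequency(h, w)
    edges = np.linspace(0.0, r.max(), n_bins + 1)
    idx = np.clip(np.digitize(r.ravel(), edges) - 1, 0, n_bins - 1)
    return np.bincount(idx, weights=power.ravel(), minlength=n_bins)

def linear_schedule(progress, start, end):
    """Hypothetical progress-aligned gain: interpolate as progress goes 0 -> 1."""
    return (1.0 - progress) * start + progress * end

def modulate_logits(logits, low_gain, high_gain, cutoff=0.15):
    """Reweight low and high spatial-frequency bands of each token's logit map.

    logits: array of shape (H, W, T), pre-softmax cross-attention logits
    laid out on the latent grid, one map per text token.
    cutoff: assumed hard boundary between the "low" and "high" bands.
    """
    h, w, _ = logits.shape
    r = radial_frequency(h, w)
    gain = np.where(r <= cutoff, low_gain, high_gain).astype(float)
    gain[0, 0] = 1.0  # assumption: keep each token's mean logit unchanged
    spec = np.fft.fft2(logits, axes=(0, 1))
    # The gain is radially symmetric, so the inverse transform is real
    # up to floating-point error; .real drops that residue.
    return np.fft.ifft2(spec * gain[:, :, None], axes=(0, 1)).real

def token_softmax(logits):
    """Numerically stable softmax over the token axis (last axis)."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Usage on random stand-in logits (not real model activations).
rng = np.random.default_rng(0)
logits = rng.normal(size=(16, 16, 4))
progress = 0.25  # fraction of denoising completed
high = linear_schedule(progress, start=0.5, end=1.5)
weights = token_softmax(modulate_logits(logits, low_gain=1.0, high_gain=high))
concentration = weights.max(axis=-1)  # stand-in summary map, an assumption
spectrum = radial_power(concentration)
```

Setting both gains to 1 leaves the logits unchanged, which makes the modulation easy to switch off for a baseline comparison. Gains below 1 on the high band smooth the token-competition pattern, and gains above 1 sharpen it.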