[2603.00483] RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.00483 (cs)
[Submitted on 28 Feb 2026]

Title: RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment
Authors: Liyao Jiang, Ruichen Chen, Chao Gao, Di Niu

Abstract: Recent text-to-image (T2I) diffusion models achieve remarkable realism, yet faithful prompt-image alignment remains challenging, particularly for complex prompts with multiple objects, relations, and fine-grained attributes. Existing training-free inference-time scaling methods rely on fixed iteration budgets that cannot adapt to prompt difficulty, while reflection-tuned models require carefully curated reflection datasets and extensive joint fine-tuning of diffusion and vision-language models, often overfitting to reflection-path data and lacking transferability across models. We introduce RAISE (Requirement-Adaptive Self-Improving Evolution), a training-free, requirement-driven evolutionary framework for adaptive T2I generation. RAISE formulates image generation as a requirement-driven adaptive scaling process, evolving a population of candidates at inference time through a diverse set of refinement actions, including prompt rewriting, noise resampling, and instructional editing. Each generation is verified against a str...
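The adaptive loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names Candidate, verify, refine, and evolve are hypothetical, an "image" is stood in for by the set of requirements it satisfies, and the three actions are placeholders that do not call any diffusion or vision-language model. The only point it aims to show is that the number of refinement rounds depends on which requirements remain unmet rather than on a fixed budget.

```python
import random
from dataclasses import dataclass

# Hypothetical action names mirroring the three refinement actions in the abstract.
ACTIONS = ("rewrite_prompt", "resample_noise", "edit_image")


@dataclass(frozen=True)
class Candidate:
    prompt: str
    satisfied: frozenset  # toy stand-in for a generated image


def verify(candidate, requirements):
    """Return the requirements this candidate does not yet meet."""
    return set(requirements) - candidate.satisfied


def refine(candidate, action, unmet, rng):
    """Toy refinement: with probability 0.5, fix one unmet requirement.

    A real system would dispatch on `action`; here it is ignored (assumption).
    """
    if unmet and rng.random() < 0.5:
        fixed = rng.choice(sorted(unmet))
        return Candidate(candidate.prompt, candidate.satisfied | {fixed})
    return candidate


def evolve(prompt, requirements, population_size=4, max_rounds=20, seed=0):
    """Evolve candidates until all requirements are met or the cap is hit.

    Returns the best candidate and the number of refinement rounds used.
    """
    rng = random.Random(seed)
    population = [Candidate(prompt, frozenset()) for _ in range(population_size)]
    for round_idx in range(max_rounds + 1):
        # Select the candidate with the fewest unmet requirements.
        best = min(population, key=lambda c: len(verify(c, requirements)))
        unmet = verify(best, requirements)
        if not unmet or round_idx == max_rounds:
            return best, round_idx  # adaptive stop: easy prompts exit early
        # Keep the best candidate (elitism) and add refined children.
        children = [
            refine(best, rng.choice(ACTIONS), unmet, rng)
            for _ in range(population_size - 1)
        ]
        population = children + [best]


if __name__ == "__main__":
    reqs = ["red cube", "blue sphere", "cube left of sphere"]
    best, rounds = evolve("a red cube left of a blue sphere", reqs)
    print(rounds, sorted(verify(best, reqs)))
```

In this sketch a prompt with no outstanding requirements returns after zero rounds, while a prompt with several requirements keeps refining until the verifier reports none unmet or the round cap is reached.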