[2602.13055] Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation
Summary
The paper presents Curriculum-DPO++, a preference-optimization method for text-to-image generation that combines a data-level curriculum (ordering preference pairs by difficulty) with a model-level curriculum (gradually increasing the trainable capacity of the denoising network), improving training efficiency and performance.
Why It Matters
This research addresses a limitation of existing preference optimization methods, which treat all preference pairs as equally easy to learn, by introducing a structured, easy-to-hard learning approach that improves the quality of generated images. By dynamically adjusting the model's learning capacity during training, it offers a more effective way to train models, which is relevant for advancements in generative AI and computer vision.
Key Takeaways
- Curriculum-DPO++ enhances preference optimization in text-to-image generation.
- The method combines data-level and model-level curricula for improved training.
- Dynamic capacity adjustment of the model leads to better performance on benchmarks.
- Outperforms existing methods in text alignment, aesthetics, and human preference.
- Code for the implementation is publicly available for further research.
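To make the data-level curriculum concrete, here is a minimal sketch of ordering preference pairs from easy to hard before batching. The difficulty score is an assumption for illustration (e.g., how close the winning and losing images score under a reward model); the paper's exact difficulty measure may differ.

```python
# Hypothetical sketch: feed DPO preference pairs in easy-to-hard order.
# "difficulty" is an assumed per-pair scalar (lower = easier); the paper's
# actual difficulty measure is not reproduced here.

def curriculum_batches(pairs, difficulty, batch_size):
    """Yield batches of preference pairs sorted from easy to hard.

    pairs      -- list of (prompt, preferred_image, rejected_image) tuples
    difficulty -- list of floats, one per pair (lower = easier)
    """
    order = sorted(range(len(pairs)), key=lambda i: difficulty[i])
    for start in range(0, len(order), batch_size):
        yield [pairs[i] for i in order[start:start + batch_size]]

# Usage: three pairs with illustrative difficulty scores.
pairs = [("p1", "win_a", "lose_a"),
         ("p2", "win_b", "lose_b"),
         ("p3", "win_c", "lose_c")]
scores = [0.9, 0.1, 0.5]
batches = list(curriculum_batches(pairs, scores, batch_size=2))
# The first batch holds the two easiest pairs (p2, then p3).
```

In a real training loop the batches would be consumed epoch by epoch, with harder pairs only appearing once the model has seen the easier ones.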
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.13055 (cs) [Submitted on 13 Feb 2026]
Title: Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation
Authors: Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Nicu Sebe, Mubarak Shah
Abstract: Direct Preference Optimization (DPO) has been proposed as an effective and efficient alternative to reinforcement learning from human feedback (RLHF). However, neither RLHF nor DPO takes into account the fact that learning certain preferences is more difficult than learning other preferences, rendering the optimization process suboptimal. To address this gap in text-to-image generation, we recently proposed Curriculum-DPO, a method that organizes image pairs by difficulty. In this paper, we introduce Curriculum-DPO++, an enhanced method that combines the original data-level curriculum with a novel model-level curriculum. More precisely, we propose to dynamically increase the learning capacity of the denoising network as training advances. We implement this capacity increase via two mechanisms. First, we initialize the model with only a subset of the trainable layers used in the original Curriculum-DPO. As training progresses, we sequentially unfreeze ...
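The sequential-unfreezing mechanism described in the abstract can be sketched as a simple step-indexed schedule. The layer names, the initial number of trainable layers, and the unfreezing interval below are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of the model-level curriculum: start with only a
# subset of layers trainable, then sequentially unfreeze the remaining
# layers on a fixed step schedule. All names and numbers are assumptions.

class Layer:
    """Stand-in for one trainable block of the denoising network."""
    def __init__(self, name):
        self.name = name
        self.trainable = False

def make_unfreeze_schedule(layers, initial_trainable, unfreeze_every):
    """Mark the first `initial_trainable` layers trainable and return a
    hook that unfreezes one additional layer every `unfreeze_every` steps."""
    for layer in layers[:initial_trainable]:
        layer.trainable = True

    def step_hook(step):
        # Number of layers that should be trainable at this training step.
        target = min(len(layers), initial_trainable + step // unfreeze_every)
        for layer in layers[:target]:
            layer.trainable = True

    return step_hook

# Usage: a 4-block network, starting with 2 trainable blocks and
# unfreezing one more block every 100 steps.
layers = [Layer(f"block{i}") for i in range(4)]
hook = make_unfreeze_schedule(layers, initial_trainable=2, unfreeze_every=100)
hook(0)    # blocks 0-1 trainable
hook(100)  # block 2 becomes trainable
hook(200)  # block 3 becomes trainable
```

In a framework like PyTorch, flipping the `trainable` flag would correspond to setting `requires_grad` on the layer's parameters; the paper's second capacity-increase mechanism is elided in the truncated abstract and is not sketched here.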