[2602.17682] Duality Models: An Embarrassingly Simple One-step Generation Paradigm
Summary
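The paper presents Duality Models (DuMo), a one-step generative modeling approach in which a shared backbone with dual heads predicts both the velocity and the flow-map from a single input. This lets the multi-step objective constrain every training sample instead of splitting the training budget between multi-step and few-step objectives.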
Why It Matters
Consistency-based models such as Shortcut and MeanFlow divide the training budget between a multi-step objective and a few-step objective, and the authors argue that this division leaves few-step generation undertrained, harming convergence and limiting scalability. DuMo is proposed to remove that trade-off, which is relevant to one-step and few-step image generation.
Key Takeaways
- Duality Models utilize a 'one input, dual output' approach to improve generative model training.
- The method aims to improve stability and efficiency by applying the multi-step objective's geometric constraints to every sample.
- The paper reports state-of-the-art performance on ImageNet with significantly fewer steps.
- Addresses the trade-off between multi-step and few-step objectives in generative modeling.
- Code availability encourages further research and application in the field.
Computer Science > Machine Learning, arXiv:2602.17682 (cs), submitted on 4 Feb 2026
Title: Duality Models: An Embarrassingly Simple One-step Generation Paradigm
Authors: Peng Sun, Xinyi Shang, Tao Lin, Zhiqiang Shen
Abstract: Consistency-based generative models like Shortcut and MeanFlow achieve impressive results via a target-aware design for solving the Probability Flow ODE (PF-ODE). Typically, such methods introduce a target time $r$ alongside the current time $t$ to modulate outputs between a local multi-step derivative ($r = t$) and a global few-step integral ($r = 0$). However, the conventional "one input, one output" paradigm enforces a partition of the training budget, often allocating a significant portion (e.g., 75% in MeanFlow) solely to the multi-step objective for stability. This separation forces a trade-off: allocating sufficient samples to the multi-step objective leaves the few-step generation undertrained, which harms convergence and limits scalability. To this end, we propose Duality Models (DuMo) via a "one input, dual output" paradigm. Using a shared backbone with dual heads, DuMo simultaneously predicts velocity $v_t$ and flow-map $u_t$ from a single input $x_t$. This applies geometric constraints from the multi-step objective to every sample, bounding the few-step estimation without separating...
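The contrast the abstract draws between a partitioned training budget and a "one input, dual output" network can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' architecture or training code: the function names (`init_dual_head`, `dual_forward`, `sample_times_partitioned`), the single tanh layer standing in for the backbone, the choice to feed both $t$ and $r$ to the shared backbone, and the uniform time sampling are all assumptions made for the example. Only the 75% fraction and the idea of two heads on one shared backbone come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_dual_head(dim, hidden, rng):
    """Hypothetical parameters: one shared layer (the 'backbone') and two linear heads."""
    return {
        "W_in": rng.normal(0.0, 0.1, (dim + 2, hidden)),  # input is [x_t, t, r]
        "b_in": np.zeros(hidden),
        "W_v": rng.normal(0.0, 0.1, (hidden, dim)),  # velocity head (v_t)
        "W_u": rng.normal(0.0, 0.1, (hidden, dim)),  # flow-map head (u_t)
    }

def dual_forward(params, x_t, t, r):
    """One input, two outputs: both heads read the same backbone features."""
    cond = np.stack([t, r], axis=-1)                 # shape (batch, 2)
    inp = np.concatenate([x_t, cond], axis=-1)       # shape (batch, dim + 2)
    h = np.tanh(inp @ params["W_in"] + params["b_in"])
    v = h @ params["W_v"]
    u = h @ params["W_u"]
    return v, u

def sample_times_partitioned(batch, frac_multi_step, rng):
    """'One input, one output' style budgeting: a fraction of samples is
    assigned r = t (multi-step objective); the rest get some r <= t.
    The uniform distributions here are an assumption for illustration."""
    t = rng.uniform(0.0, 1.0, batch)
    r = rng.uniform(0.0, 1.0, batch) * t
    multi = rng.uniform(size=batch) < frac_multi_step
    r = np.where(multi, t, r)
    return t, r, multi

params = init_dual_head(dim=4, hidden=16, rng=rng)
x_t = rng.normal(size=(8, 4))
t, r, multi = sample_times_partitioned(8, 0.75, rng)  # 75% as cited for MeanFlow
v, u = dual_forward(params, x_t, t, r)
print(v.shape, u.shape)
```

In the partitioned setup, only the samples flagged by `multi` would contribute to the multi-step loss. With two heads, every sample yields both a `v` and a `u` prediction, so a loss on `v` could in principle be applied to the whole batch alongside the loss on `u`; how the paper actually couples the two outputs is not specified in the truncated abstract.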