[2509.26346] EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
Computer Science > Computer Vision and Pattern Recognition

arXiv:2509.26346 (cs)

[Submitted on 30 Sep 2025 (v1), last revised 28 Feb 2026 (this version, v2)]

Title: EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Authors: Keming Wu, Sicong Jiang, Max Ku, Ping Nie, Minghao Liu, Wenhu Chen

Abstract: Recently, we have witnessed great progress in image editing with natural language instructions. Several closed-source models, such as GPT-Image-1, Seedream, and Google-Nano-Banana, have shown highly promising results. However, open-source models still lag behind. The main bottleneck is the lack of a reliable reward model with which to scale up high-quality synthetic training data. To address this critical bottleneck, we built EditReward, trained on our new large-scale human preference dataset of over 200K preference pairs, meticulously annotated by trained experts following a rigorous protocol. EditReward demonstrates superior alignment with human preferences in instruction-guided image editing tasks. Experiments show that EditReward achieves state-of-the-art human correlation on established benchmarks such as GenAI-Bench, AURORA-Bench, ImagenHub, and our new EditReward-Bench, outperforming a wide range of VLM-as-judge models. Furthermore, we use EditReward to select a high-quality su...
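The abstract does not specify EditReward's training objective, but a common choice for learning a reward model from pairwise human preference data is the Bradley-Terry objective: the model assigns a scalar score to each edited image, and the loss pushes the score of the preferred edit above the score of the rejected one. The sketch below illustrates that objective with hypothetical scalar scores; the function names and signatures are assumptions, not the paper's implementation.

```python
import math


def bradley_terry_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise preference loss: -log sigmoid(s_chosen - s_rejected).

    A hypothetical sketch of a standard reward-model objective, not
    EditReward's actual loss. The loss is minimized when the reward
    model scores the human-preferred edit higher than the rejected one.
    """
    margin = score_chosen - score_rejected
    # Numerically stable -log(sigmoid(margin)) == log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


# A wider margin in favor of the preferred edit yields a lower loss;
# equal scores give the maximal "undecided" loss of log 2.
tied = bradley_terry_loss(1.0, 1.0)
separated = bradley_terry_loss(2.0, 0.0)
```

Summing this loss over the annotated preference pairs trains the scorer; the resulting scalar reward can then rank candidate edits, e.g. to filter a high-quality synthetic training subset as the abstract describes.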