[2511.20629] MapReduce LoRA: Advancing the Pareto Front in Multi-Preference Optimization for Generative Models

arXiv - Machine Learning · 4 min read · Article

Summary

The paper presents MapReduce LoRA, a framework for aligning generative models with multiple human preferences at once. It introduces two complementary methods that improve performance across tasks, demonstrating significant gains on generative quality metrics.

Why It Matters

This research is significant as it tackles the challenge of aligning generative models with multiple human preferences, a crucial aspect for applications in AI-driven content creation. By advancing the state-of-the-art in multi-preference optimization, it enhances the usability and effectiveness of generative models across various domains.

Key Takeaways

  • MapReduce LoRA introduces a new approach to optimize generative models for multiple preferences.
  • The framework shows substantial improvements in generative tasks, including text-to-image and text-to-video generation.
  • It employs parallel training of preference-specific experts to refine a shared model effectively.
  • Reward-aware Token Embedding (RaTE) enhances flexibility in preference control during inference.
  • The study sets a new benchmark for multi-preference alignment in generative models.
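The "parallel experts, then merge" idea in the takeaways can be illustrated with a toy sketch. This is not the authors' code: `train_expert`, the rank-2 adapters, and the uniform averaging in the reduce step are all simplifying assumptions standing in for real preference-specific LoRA training and the paper's actual merge rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_expert(base, reward_id, lr=0.1):
    """Stand-in for preference-specific LoRA training (hypothetical).
    Returns a low-rank delta for `base`; `reward_id` marks which
    preference/reward this expert would optimize."""
    d_out, d_in = base.shape
    r = 2  # toy adapter rank
    A = rng.normal(scale=lr, size=(d_out, r))
    B = rng.normal(scale=lr, size=(r, d_in))
    return A @ B  # LoRA-style update: a rank-r matrix

def mapreduce_round(base, n_experts=3):
    # Map: train one LoRA expert per preference, conceptually in parallel.
    deltas = [train_expert(base, k) for k in range(n_experts)]
    # Reduce: merge the expert deltas back into the shared base
    # (uniform average here; the real merge rule may differ).
    return base + np.mean(deltas, axis=0)

W = np.zeros((8, 8))   # toy shared base weight
for _ in range(4):     # iterate map -> reduce rounds
    W = mapreduce_round(W)
```

The key property the sketch shows is that each round refines a single shared model from several preference-specific updates, rather than optimizing one joint reward.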

Computer Science > Computer Vision and Pattern Recognition

arXiv:2511.20629 (cs) [Submitted on 25 Nov 2025 (v1), last revised 23 Feb 2026 (this version, v4)]

Title: MapReduce LoRA: Advancing the Pareto Front in Multi-Preference Optimization for Generative Models

Authors: Chieh-Yun Chen, Zhonghao Wang, Qi Chen, Zhifan Ye, Min Shi, Yue Zhao, Yinan Zhao, Hui Qu, Wei-An Lin, Yiru Shen, Ajinkya Kale, Irfan Essa, Humphrey Shi

Abstract: Reinforcement learning from human feedback (RLHF) with reward models has advanced the alignment of generative models to human aesthetic and perceptual preferences. However, jointly optimizing multiple rewards often incurs an alignment tax, improving one dimension while degrading others. To address this, we introduce two complementary methods: MapReduce LoRA and Reward-aware Token Embedding (RaTE). MapReduce LoRA trains preference-specific LoRA experts in parallel and iteratively merges them to refine a shared base model; RaTE learns reward-specific token embeddings that compose at inference for flexible preference control. Experiments on Text-to-Image generation (Stable Diffusion 3.5 Medium and FLUX.1-dev) show improvements of 36.1%, 4.6%, and 55.7%, and 32.7%, 4.3%, and 67.1% on GenEval, PickScore, and OCR, respectively. On Text-to-Video generation (HunyuanVi...
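The abstract's second method, RaTE, composes reward-specific token embeddings at inference time. A minimal sketch of that composition, with entirely hypothetical reward names and embedding values (the paper's learned embeddings and injection mechanism are not shown in the source):

```python
import numpy as np

# Hypothetical reward-specific token embeddings, one per reward dimension.
# In RaTE these would be learned; here they are toy 3-d vectors.
emb = {
    "aesthetic": np.array([1.0, 0.0, 0.0]),
    "text_fidelity": np.array([0.0, 1.0, 0.0]),
    "alignment": np.array([0.0, 0.0, 1.0]),
}

def compose(weights):
    """Compose reward tokens at inference: a normalized weighted sum,
    where the weights set the user's preference trade-off."""
    total = sum(weights.values())
    return sum((w / total) * emb[name] for name, w in weights.items())

# Emphasize aesthetics twice as much as the other two rewards.
token = compose({"aesthetic": 2.0, "text_fidelity": 1.0, "alignment": 1.0})
# The composed token would then be injected into the prompt embedding
# sequence to steer generation.
```

Because composition happens at inference, the trade-off can be changed per prompt without retraining, which is the flexibility the abstract highlights.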

Related Articles

Machine Learning

[R] VOID: Video Object and Interaction Deletion (physically-consistent video inpainting)

We present VOID, a model for video object removal that aims to handle *physical interactions*, not just appearance. Most existing video i...

Reddit - Machine Learning · 1 min ·
Machine Learning

FLUX 2 Pro (2026) Sketch to Image

I sketched a cow and tested how different models interpret it into a realistic image for downstream 3D generation, turns out some models ...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Improving AI models’ ability to explain their predictions

AI News - General · 9 min ·
Machine Learning

[D] TMLR reviews seem more reliable than ICML/NeurIPS/ICLR

This year I submitted a paper to ICML for the first time. I have also experienced the review process at TMLR and ICLR. From my observatio...

Reddit - Machine Learning · 1 min ·
