[2601.21452] SAGE: Sequence-level Adaptive Gradient Evolution for Generative Recommendation
Summary
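The paper presents SAGE (Sequence-level Adaptive Gradient Evolution), an optimizer for list-wise generative recommendation. It targets three limitations the authors identify in existing preference optimizers such as Gradient-Bounded Policy Optimization (GBPO): suppressed learning from rare positive signals, diversity collapse under rejection-dominated feedback, and low-resolution training signals from group-normalized multi-objective rewards.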
Why It Matters
Reinforcement learning-based preference optimization is increasingly used to align list-wise generative recommenders with complex, multi-objective user feedback. The paper argues that existing optimizers have structural limitations in this setting, and SAGE is designed around the specific failure modes it names, such as weak learning from cold-start items and diversity collapse.
Key Takeaways
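- SAGE is a sequence-level optimizer for list-wise generative recommendation; it aligns the training signal at the sequence level with a geometric-mean importance ratio to reduce token-level variance.
- It targets a "Symmetric Conservatism" failure mode: symmetric update bounds that suppress learning from rare positive signals (e.g., cold-start items) and static negative-sample constraints that fail to prevent diversity collapse.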
- The method shows improved accuracy and recall in experiments on real-world datasets.
Computer Science > Machine Learning
arXiv:2601.21452 (cs)
[Submitted on 29 Jan 2026 (v1), last revised 13 Feb 2026 (this version, v3)]
Title: SAGE: Sequence-level Adaptive Gradient Evolution for Generative Recommendation
Authors: Yu Xie, Xing Kai Ren, Ying Qi, Hu Yao
Abstract: Reinforcement learning-based preference optimization is increasingly used to align list-wise generative recommenders with complex, multi-objective user feedback, yet existing optimizers such as Gradient-Bounded Policy Optimization (GBPO) exhibit structural limitations in recommendation settings. We identify a Symmetric Conservatism failure mode in which symmetric update bounds suppress learning from rare positive signals (e.g., cold-start items), static negative-sample constraints fail to prevent diversity collapse under rejection-dominated feedback, and group-normalized multi-objective rewards lead to low-resolution training signals. To address these issues, we propose SAGE (Sequence-level Adaptive Gradient Evolution), a unified optimizer designed for list-wise generative recommendation. SAGE introduces sequence-level signal alignment via a geometric-mean importance ratio and a decoupled multi-objective advantage estimator to reduce token-level variance and mitigate reward collapse, together with asymmetric adaptive bounding that appli...
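The abstract names a geometric-mean importance ratio as the sequence-level signal but does not spell out the formula in this excerpt. As a rough illustration only, the sketch below uses the standard definition of a geometric mean over per-token probability ratios, computed in log space. The function names and the example log-probabilities are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def token_ratios(new_logprobs: Sequence[float], old_logprobs: Sequence[float]) -> list[float]:
    """Per-token importance ratios pi_new(token) / pi_old(token)."""
    return [math.exp(n - o) for n, o in zip(new_logprobs, old_logprobs)]


def geometric_mean_ratio(new_logprobs: Sequence[float], old_logprobs: Sequence[float]) -> float:
    """Geometric mean of the per-token ratios for one generated sequence.

    Assumption: this is the textbook geometric mean, exp(mean(log ratio)),
    not necessarily the exact estimator used by SAGE.
    """
    if len(new_logprobs) != len(old_logprobs) or len(new_logprobs) == 0:
        raise ValueError("log-prob sequences must be non-empty and of equal length")
    mean_log_ratio = sum(n - o for n, o in zip(new_logprobs, old_logprobs)) / len(new_logprobs)
    return math.exp(mean_log_ratio)


# Hypothetical three-token sequence with per-token ratios 2.0, 2.0, and 0.5.
new_lp = [math.log(0.5), math.log(0.4), math.log(0.2)]
old_lp = [math.log(0.25), math.log(0.2), math.log(0.4)]

per_token = token_ratios(new_lp, old_lp)
product_ratio = math.prod(per_token)               # grows or shrinks with sequence length
sequence_ratio = geometric_mean_ratio(new_lp, old_lp)  # length-normalized
print(per_token, product_ratio, sequence_ratio)
```

Because the geometric mean normalizes by sequence length, one ratio is shared by every token in the sequence and does not grow or shrink exponentially with length the way the raw product does. That is one plausible reading of how a sequence-level ratio could reduce token-level variance.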