[2601.21452] SAGE: Sequence-level Adaptive Gradient Evolution for Generative Recommendation

arXiv - Machine Learning · 4 min read

Summary

The paper presents SAGE, a new optimizer for list-wise generative recommendation that addresses structural limitations of existing reinforcement learning-based preference optimizers, improving learning from rare positive user feedback and preserving diversity in recommendations.

Why It Matters

As generative recommendation systems become more prevalent, optimizing their performance is crucial for user satisfaction and engagement. SAGE's innovative approach tackles common issues in recommendation algorithms, making it a significant contribution to the field of machine learning.

Key Takeaways

  • SAGE introduces a sequence-level optimizer to enhance generative recommendations.
  • It addresses suppressed learning on rare positive signals (e.g., cold-start items) and diversity collapse under rejection-dominated feedback.
  • The method shows improved accuracy and recall in experiments on real-world datasets.

Computer Science > Machine Learning
arXiv:2601.21452 (cs)
[Submitted on 29 Jan 2026 (v1), last revised 13 Feb 2026 (this version, v3)]

Title: SAGE: Sequence-level Adaptive Gradient Evolution for Generative Recommendation
Authors: Yu Xie, Xing Kai Ren, Ying Qi, Hu Yao

Abstract: Reinforcement learning-based preference optimization is increasingly used to align list-wise generative recommenders with complex, multi-objective user feedback, yet existing optimizers such as Gradient-Bounded Policy Optimization (GBPO) exhibit structural limitations in recommendation settings. We identify a Symmetric Conservatism failure mode in which symmetric update bounds suppress learning from rare positive signals (e.g., cold-start items), static negative-sample constraints fail to prevent diversity collapse under rejection-dominated feedback, and group-normalized multi-objective rewards lead to low-resolution training signals. To address these issues, we propose SAGE (Sequence-level Adaptive Gradient Evolution), a unified optimizer designed for list-wise generative recommendation. SAGE introduces sequence-level signal alignment via a geometric-mean importance ratio and a decoupled multi-objective advantage estimator to reduce token-level variance and mitigate reward collapse, together with asymmetric adaptive bounding that appli...
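To make the abstract's two core ideas concrete, here is a minimal, hedged sketch of (a) a sequence-level importance ratio formed as the geometric mean of per-token policy ratios, and (b) asymmetric clipping that bounds updates differently for positive and negative advantages. The function names, clip thresholds, and the simple advantage-sign rule are illustrative assumptions, not the paper's actual formulation (the full abstract is truncated above).

```python
import math

def sequence_importance_ratio(new_logprobs, old_logprobs):
    # Geometric mean of per-token ratios exp(new - old), computed
    # stably as exp(mean of per-token log-ratios). This yields one
    # sequence-level ratio instead of T noisy token-level ratios.
    n = len(new_logprobs)
    mean_log_ratio = sum(nl - ol for nl, ol in zip(new_logprobs, old_logprobs)) / n
    return math.exp(mean_log_ratio)

def asymmetric_clipped_update(ratio, advantage, lo=0.8, hi=1.5):
    # Asymmetric bounds (hypothetical constants): a looser upper bound
    # lets rare positive signals (e.g., cold-start items) push updates
    # further, while a separate lower bound limits how hard negative
    # (rejection) feedback can drive the policy toward collapse.
    if advantage >= 0:
        clipped = min(ratio, hi)
    else:
        clipped = max(ratio, lo)
    return clipped * advantage

# Identical policies give a ratio of exactly 1.0.
r = sequence_importance_ratio([-1.2, -0.7], [-1.2, -0.7])  # -> 1.0
```

In a full optimizer these pieces would sit inside a PPO-style surrogate loss; the sketch only shows how a geometric-mean ratio and sign-dependent bounds interact, under the stated assumptions.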
