[2510.22049] Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders


arXiv - Machine Learning · 4 min read

Summary

This paper presents VISTA, a two-stage modeling framework for generative recommenders that improves scalability by summarizing long user histories into a compact set of tokens, enabling stronger performance in large-scale recommendation systems.

Why It Matters

As recommendation systems increasingly rely on vast amounts of user data, the ability to efficiently process and utilize long user histories is crucial. VISTA addresses significant scalability challenges, making it relevant for industries that depend on real-time recommendations for billions of users.

Key Takeaways

  • VISTA improves scalability for recommendation systems by summarizing user history into a few hundred tokens.
  • The two-stage modeling framework separates user history summarization from candidate item attention, enhancing efficiency.
  • This approach handles lifelong user histories of up to one million items without increasing serving-time latency or GPU cost.
  • VISTA has shown significant improvements in both offline and online metrics.
  • The framework has been successfully deployed in an industry-leading recommendation platform.
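The two-stage decomposition above can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the variable names, dimensions, and the use of plain scaled dot-product attention are assumptions for clarity. Stage 1 compresses a long user history into a few hundred summary tokens via learned virtual queries (computable once per user and cacheable); stage 2 lets each candidate item attend only to that short summary, so per-candidate cost no longer grows with history length.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    # Standard scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
d = 64
history_len = 10_000   # stands in for ultra-long (10k-100k item) histories
num_virtual = 256      # "a few hundred" summary tokens

history = rng.standard_normal((history_len, d))
virtual_queries = rng.standard_normal((num_virtual, d))  # learned in practice

# Stage 1 (per user, cacheable): virtual tokens summarize the full history.
summary = attend(virtual_queries, history, history)      # shape (256, 64)

# Stage 2 (per candidate, cheap): the candidate attends to the summary only,
# so scoring cost is independent of history_len.
candidate = rng.standard_normal((1, d))
user_repr = attend(candidate, summary, summary)          # shape (1, 64)
```

The key design point this sketch captures is that the expensive pass over the full history happens once in stage 1, while stage 2 touches only `num_virtual` tokens per candidate, which is why history length can grow without raising per-query cost.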

Computer Science > Information Retrieval
arXiv:2510.22049 (cs)
[Submitted on 24 Oct 2025 (v1), last revised 25 Feb 2026 (this version, v2)]

Title: Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders

Authors: Zhimin Chen, Chenyu Zhao, Ka Chun Mo, Yunjiang Jiang, Jane H. Lee, Khushhall Chandra Mahajan, Ning Jiang, Kai Ren, Jinhui Li, Wen-Yun Yang

Abstract: Modern large-scale recommendation systems rely heavily on user interaction history sequences to enhance model performance. The advent of large language models and sequential modeling techniques, particularly transformer-like architectures, has led to significant advancements recently (e.g., the HSTU, SIM, and TWIN models). While scaling to ultra-long user histories (10k to 100k items) generally improves model performance, it also creates significant challenges in latency, queries per second (QPS), and GPU cost in industry-scale recommendation systems. Existing models do not adequately address these industrial scalability issues. In this paper, we propose a novel two-stage modeling framework, namely VIrtual Sequential Target Attention (VISTA), which decomposes traditional target attention from a candidate item to user history items into two distinct stages: (1) use...

