[2504.12007] Diffusion Generative Recommendation with Continuous Tokens


arXiv - AI

Summary

The paper presents ContRec, a novel framework that integrates continuous tokens into LLM-based recommender systems, enhancing user preference modeling and item retrieval.

Why It Matters

This research addresses limitations in traditional recommender systems by proposing a continuous tokenization approach, which improves gradient propagation and learning efficiency. It highlights the potential of generative AI in advancing recommendation technologies, making it relevant for AI researchers and industry practitioners focused on enhancing user experience through personalized recommendations.

Key Takeaways

  • ContRec uses continuous tokens to improve LLM-based recommendation systems.
  • The framework includes a sigma-VAE Tokenizer and a Dispersive Diffusion module for better user preference modeling.
  • Experiments show ContRec outperforms traditional and state-of-the-art recommender systems.
  • The approach addresses issues of lossy tokenization and inaccurate gradient propagation.
  • This research opens avenues for future advancements in generative modeling for recommendations.
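The lossy-tokenization problem named in the fourth takeaway comes from standard vector quantization: the encoder output is snapped to its nearest codebook entry via an argmin, which has no useful gradient. A minimal NumPy sketch of that lookup (illustrative only, not the paper's code; the codebook values are made up):

```python
import numpy as np

# Toy codebook of 4 discrete token embeddings (rows), dimension 3.
codebook = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

def vector_quantize(z):
    """Map a continuous encoding z to its nearest codebook entry.

    The argmin below is the non-differentiable step: small changes in z
    usually leave the chosen index unchanged (zero gradient) or flip it
    discontinuously, so gradients cannot flow back to the encoder
    without workarounds such as the straight-through estimator.
    """
    dists = np.linalg.norm(codebook - z, axis=1)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

idx, token = vector_quantize(np.array([0.9, 0.1, 0.0]))
# The nearest entry is codebook[1] = [1, 0, 0].
```

This is the "inaccurate gradient propagation" the summary refers to: the recommendation loss can adjust the codebook entries, but the argmin blocks a clean gradient path to the encoder.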

Computer Science > Information Retrieval
arXiv:2504.12007 (cs)
Submitted on 16 Apr 2025 (v1), last revised 24 Feb 2026 (this version, v5)

Title: Diffusion Generative Recommendation with Continuous Tokens
Authors: Haohao Qu, Shanru Lin, Yujuan Ding, Yiqi Wang, Wenqi Fan

Abstract: Recent advances in generative artificial intelligence, particularly large language models (LLMs), have opened new opportunities for enhancing recommender systems (RecSys). Most existing LLM-based RecSys approaches operate in a discrete space, using vector-quantized tokenizers to align with the inherent discrete nature of language models. However, these quantization methods often result in lossy tokenization and suboptimal learning, primarily due to inaccurate gradient propagation caused by the non-differentiable argmin operation in standard vector quantization. Inspired by the emerging trend of embracing continuous tokens in language models, we propose ContRec, a novel framework that seamlessly integrates continuous tokens into LLM-based RecSys. Specifically, ContRec consists of two key modules: a sigma-VAE Tokenizer, which encodes users/items with continuous tokens; and a Dispersive Diffusion module, which captures implicit user preference. The tokenizer is trained with a continuous Variational Auto-Encoder (VAE) objective, where three effective te...
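By contrast, a tokenizer trained with a continuous VAE objective, as the abstract describes, stays differentiable end to end via the standard reparameterization trick. A hedged sketch (the sigma-VAE specifics are not given in this excerpt; the variable names and values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def continuous_token(mu, log_sigma, rng):
    """Sample a continuous token z ~ N(mu, sigma^2) by reparameterization.

    z = mu + sigma * eps with eps ~ N(0, I): the randomness is isolated
    in eps, so z is a differentiable function of (mu, log_sigma) and
    gradients flow back to the encoder with no argmin in the path.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

# Hypothetical encoder outputs for one user/item.
mu = np.array([0.9, 0.1, 0.0])
log_sigma = np.full(3, -2.0)   # sigma = exp(-2) ~ 0.135
z = continuous_token(mu, log_sigma, rng)
# z stays close to mu because sigma is small.
```

Compared with the discrete lookup, nothing in this path discards information or blocks gradients, which is the core argument for continuous tokens made in the abstract.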

Related Articles

LLMs

"Oops! ChatGPT is Temporarily Unavailable!": A Diary Study on Knowledge Workers' Experiences of LLM Withdrawal


Reddit - Artificial Intelligence · 1 min ·
LLMs

I built a Star Trek LCARS terminal that reads your entire AI coding setup

Side project that got out of hand. It's a dashboard for Claude Code that scans your ~/.claude/ directory and renders everything as a TNG ...

Reddit - Artificial Intelligence · 1 min ·
LLMs

[R] Is autoresearch really better than classic hyperparameter tuning?

We did experiments comparing Optuna & autoresearch. Autoresearch converges faster, is more cost-efficient, and even generalizes bette...

Reddit - Machine Learning · 1 min ·
LLMs

Claude Source Code?

Has anyone been able to successfully download the leaked source code yet? I've not been able to find it. If anyone has, please reach out....

Reddit - Artificial Intelligence · 1 min ·

