[2602.11062] MoToRec: Sparse-Regularized Multimodal Tokenization for Cold-Start Recommendation

arXiv - Machine Learning · March 04, 2026 · 4 min read

arXiv:2602.11062 (cs), Computer Science > Machine Learning
Submitted on 11 Feb 2026 (v1), last revised 3 Mar 2026 (this version, v2)

Title: MoToRec: Sparse-Regularized Multimodal Tokenization for Cold-Start Recommendation

Authors: Jialin Liu, Zhaorui Zhang, Ray C.C. Cheung

Abstract: Graph neural networks (GNNs) have revolutionized recommender systems by effectively modeling complex user-item interactions, yet data sparsity and the item cold-start problem significantly impair performance, particularly for new items with limited or no interaction history. While multimodal content offers a promising solution, existing methods result in suboptimal representations for new items due to noise and entanglement in sparse data. To address this, we transform multimodal recommendation into discrete semantic tokenization. We present Sparse-Regularized Multimodal Tokenization for Cold-Start Recommendation (MoToRec), a framework centered on a sparsely-regularized Residual Quantized Variational Autoencoder (RQ-VAE) that generates a compositional semantic code of discrete, interpretable tokens, promoting disentangled representations. MoToRec's architecture is enhanced by three synergistic components: (1) a sparsely-regularized RQ-VAE that promotes disentangled representations, (2) a novel adaptive rarity amplification...
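To make the idea of a "compositional semantic code of discrete tokens" more concrete, here is a minimal sketch of generic residual quantization, the lookup step that an RQ-VAE is built around. It is not the authors' implementation: the codebooks are random rather than learned, the encoder, decoder, and sparsity regularizer mentioned in the abstract are omitted because the abstract does not specify them, and every name and size (DIM, LEVELS, CODES, residual_quantize, reconstruct) is a hypothetical choice made for illustration.

```python
import random


def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared Euclidean)."""
    def dist(row):
        return sum((r - v) ** 2 for r, v in zip(row, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vec, codebooks):
    """Emit one token per level; each level quantizes what the previous levels left over."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        # Subtract the chosen entry so the next level only sees the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens, residual


def reconstruct(tokens, codebooks):
    """Approximate the original vector by summing the selected entries across levels."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical sizes: 4-dim embeddings, 3 quantization levels, 8 codes per level.
DIM, LEVELS, CODES = 4, 3, 8
rng = random.Random(0)
# Random stand-in codebooks; a trained RQ-VAE would learn these jointly with an encoder.
codebooks = [
    [[rng.gauss(0, 1.0 / (level + 1)) for _ in range(DIM)] for _ in range(CODES)]
    for level in range(LEVELS)
]
# Stand-in for an encoded multimodal item embedding.
item_embedding = [rng.gauss(0, 1) for _ in range(DIM)]

tokens, residual = residual_quantize(item_embedding, codebooks)
approx = reconstruct(tokens, codebooks)
print("tokens:", tokens)
```

The resulting token list plays the role of the item's discrete code: a new item with no interaction history can still be mapped to tokens from its content embedding alone, which is the property the abstract relies on for cold-start items.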

Originally published on March 04, 2026. Curated by AI News.

