[2602.17410] Improving LLM-based Recommendation with Self-Hard Negatives from Intermediate Layers
Summary
This paper presents ILRec, a novel framework that enhances LLM-based recommendation systems by utilizing self-hard negative signals from intermediate layers to improve preference learning.
Why It Matters
As large language models (LLMs) are increasingly used in recommendation systems, improving their effectiveness is crucial. This research addresses a limitation of current preference-learning methods, which rely on sequence-level, offline-generated negatives, by introducing a more dynamic and informative approach to negative sample generation, which could lead to better user experiences and more accurate recommendations.
Key Takeaways
- ILRec framework leverages self-hard negative signals for better training.
- The two-stage framework includes cross-layer preference optimization and distillation.
- Extensive experiments show ILRec improves performance on recommendation tasks.
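The preference-optimization stage builds on the standard idea of contrasting the ground-truth item against a hard negative. The paper's exact objective is not given in this summary, so the sketch below only illustrates a generic Bradley-Terry style pairwise loss of the kind such frameworks extend; all function and variable names are hypothetical.

```python
import numpy as np

def pairwise_preference_loss(logit_pos, logit_neg, beta=1.0):
    """Generic pairwise preference objective: push the target item's score
    above the hard negative's score via -log(sigmoid(beta * margin))."""
    margin = beta * (logit_pos - logit_neg)
    # log1p(exp(-m)) is a numerically stable form of -log(sigmoid(m))
    return float(np.log1p(np.exp(-margin)))

# the loss shrinks as the positive pulls further ahead of the hard negative
loss_close = pairwise_preference_loss(2.0, 1.9)   # small margin -> larger loss
loss_far = pairwise_preference_loss(2.0, -3.0)    # large margin -> smaller loss
```

A harder (higher-scoring) negative shrinks the margin and therefore yields a larger gradient signal, which is why hard negatives are more informative than random ones in large item spaces.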
Paper Details
Subject: Computer Science > Information Retrieval, arXiv:2602.17410 (cs). Submitted on 19 Feb 2026.
Authors: Bingqian Li, Bowen Zheng, Xiaolei Wang, Long Zhang, Jinpeng Wang, Sheng Chen, Wayne Xin Zhao, Ji-Rong Wen
Abstract: Large language models (LLMs) have shown great promise in recommender systems, where supervised fine-tuning (SFT) is commonly used for adaptation. Subsequent studies further introduce preference learning to incorporate negative samples into the training process. However, existing methods rely on sequence-level, offline-generated negatives, making them less discriminative and informative when adapting LLMs to recommendation tasks with large negative item spaces. To address these challenges, we propose ILRec, a novel preference fine-tuning framework for LLM-based recommendation, leveraging self-hard negative signals extracted from intermediate layers to improve preference learning. Specifically, we identify self-hard negative tokens from intermediate layers as fine-grained negative supervision that dynamically reflects the model's preference learning process. To effectively integrate these signals into training, we design a two-stage framework comprising cross-layer preference optimization and cross-...
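The abstract describes mining "self-hard negative tokens from intermediate layers." The paper's exact extraction procedure is not given here, but a common way to read token preferences out of intermediate layers is to project each layer's hidden state through the model's output head (logit-lens style) and treat the highest-scoring non-target token as that layer's hard negative. The sketch below illustrates this idea on toy numpy data; the projection choice and all names are assumptions, not the paper's method.

```python
import numpy as np

def self_hard_negatives(hidden_states, W_unembed, target_id):
    """For each intermediate layer, project the hidden state through the
    output head (logit-lens style) and take the highest-scoring token that
    is NOT the ground-truth target as that layer's self-hard negative."""
    negatives = []
    for h in hidden_states:            # h: (d_model,) hidden state per layer
        logits = h @ W_unembed         # (vocab_size,) per-token scores
        order = np.argsort(-logits)    # token ids by descending score
        # skip the target itself if it happens to rank first
        neg = order[0] if order[0] != target_id else order[1]
        negatives.append(int(neg))
    return negatives

# toy setup: 4 intermediate layers, d_model=8, vocab_size=50
rng = np.random.default_rng(0)
layers = [rng.normal(size=8) for _ in range(4)]
W = rng.normal(size=(8, 50))
negs = self_hard_negatives(layers, W, target_id=7)
```

Because each layer ranks tokens differently as training progresses, the negatives extracted this way change dynamically with the model's own preference-learning state, which matches the summary's contrast with static, offline-generated negatives.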