[2602.19702] DReX: An Explainable Deep Learning-based Multimodal Recommendation Framework
Summary
DReX is a multimodal recommendation framework that incrementally refines user and item representations from interaction-level multimodal features, pairing this with built-in explainability to address common limitations of existing systems.
Why It Matters
This research is significant because it tackles the cold-start and data-sparsity challenges in recommendation systems by integrating multiple data modalities. The explainability component lets users see why items are recommended, which is crucial in applications like e-commerce and content platforms.
Key Takeaways
- DReX improves representation alignment between users and items by using interaction-level features.
- The framework eliminates the need for separate feature extraction stages, simplifying the overall recommendation pipeline.
- It demonstrates robustness to varying or missing data modalities, enhancing its applicability in real-world scenarios.
- DReX automatically generates interpretable keyword profiles, aiding in understanding user preferences.
- Experimental results show DReX outperforms existing state-of-the-art recommendation methods.
Computer Science > Information Retrieval
arXiv:2602.19702 (cs)
[Submitted on 23 Feb 2026]

Title: DReX: An Explainable Deep Learning-based Multimodal Recommendation Framework
Authors: Adamya Shyam, Venkateswara Rao Kagita, Bharti Rana, Vikas Kumar

Abstract: Multimodal recommender systems leverage diverse data sources, such as user interactions, content features, and contextual information, to address challenges like cold-start and data sparsity. However, existing methods often suffer from one or more key limitations: processing different modalities in isolation, requiring complete multimodal data for each interaction during training, or independent learning of user and item representations. These factors contribute to increased complexity and potential misalignment between user and item embeddings. To address these challenges, we propose DReX, a unified multimodal recommendation framework that incrementally refines user and item representations by leveraging interaction-level features from multimodal feedback. Our model employs gated recurrent units to selectively integrate these fine-grained features into global representations. This incremental update mechanism provides three key advantages: (1) simultaneous modeling of both nuanced interaction details and broader preference patterns, (2) eliminates the ...
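The abstract describes gated recurrent units that selectively fold interaction-level features into a global representation. A minimal NumPy sketch of that kind of incremental update, assuming a single GRU cell acting on the concatenation of the interaction feature and the current embedding (all weight names, dimensions, and initializations here are hypothetical, not the paper's exact architecture):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUUpdater:
    """One GRU cell used to fold an interaction-level feature vector x
    into a global embedding h. Illustrative only: weight shapes and
    initialization are assumptions, not taken from the paper."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # one weight matrix per gate, each acting on the concatenation [x; h]
        self.Wz = rng.normal(0.0, 0.1, (dim, 2 * dim))  # update gate
        self.Wr = rng.normal(0.0, 0.1, (dim, 2 * dim))  # reset gate
        self.Wh = rng.normal(0.0, 0.1, (dim, 2 * dim))  # candidate state

    def step(self, h, x):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)      # how much of h to revise
        r = sigmoid(self.Wr @ xh)      # how much history feeds the candidate
        h_cand = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1.0 - z) * h + z * h_cand  # gated blend of old state and new evidence

dim = 8
updater = GRUUpdater(dim)
rng = np.random.default_rng(1)
h = np.zeros(dim)                      # global user embedding, refined incrementally
for _ in range(5):                     # five simulated multimodal interaction features
    h = updater.step(h, rng.normal(size=dim))
```

The gating is what makes the update selective: the update gate `z` decides, per dimension, how much of the prior global embedding to keep versus overwrite, so a single noisy interaction cannot erase accumulated preference structure.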