[2602.16979] Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling
Summary
The paper presents PRIMO, a supervised latent-variable imputation model that handles incomplete multimodal data and quantifies the predictive impact of any missing modality.
Why It Matters
Multimodal large language models typically assume that every modality is available during both training and inference, yet real-world multimodal data is often incomplete: modalities may be missing, collected asynchronously, or available only for a subset of examples. PRIMO trains on all available examples, whether complete or partial, and reports how much a missing modality affects a prediction.
Key Takeaways
- PRIMO models a missing modality through a latent variable that captures its relationship with the observed modality in the context of prediction.
- The model trains on all available examples, whether their modalities are complete or partial.
- Evaluation on diverse datasets shows PRIMO performs comparably to both unimodal and multimodal baselines.
- A variance-based metric provides insight into the predictive impact of missing modalities.
- PRIMO can enhance applications in fields like healthcare and computer vision by improving data utilization.
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.16979 (cs)
[Submitted on 19 Feb 2026]
Title: Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling
Authors: Divyam Madaan, Sumit Chopra, Kyunghyun Cho
Abstract: Despite the recent success of Multimodal Large Language Models (MLLMs), existing approaches predominantly assume the availability of multiple modalities during training and inference. In practice, multimodal data is often incomplete because modalities may be missing, collected asynchronously, or available only for a subset of examples. In this work, we propose PRIMO, a supervised latent-variable imputation model that quantifies the predictive impact of any missing modality within the multimodal learning setting. PRIMO enables the use of all available training examples, whether modalities are complete or partial. Specifically, it models the missing modality through a latent variable that captures its relationship with the observed modality in the context of prediction. During inference, we draw many samples from the learned distribution over the missing modality to both obtain the marginal predictive distribution (for the purpose of prediction) and analyze the impact of the missing modalities on the prediction for each instanc...
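The inference procedure described in the abstract (draw many samples for the missing modality, average the predictions, and inspect how much they vary) can be illustrated with a small Monte Carlo sketch. This is not the authors' implementation. The Gaussian sampler, the two-class predictor, the `weight_z` parameter, and the summed per-class variance used as an impact score are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
import random
from statistics import fmean, pvariance


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_latent(x_obs, rng):
    """Hypothetical stand-in for the learned distribution over the missing
    modality given the observed one: a unit-variance Gaussian centred on x_obs."""
    return rng.gauss(x_obs, 1.0)


def predict_proba(x_obs, z, weight_z):
    """Hypothetical two-class predictor p(y | x_obs, z).
    weight_z controls how strongly the latent sample moves the logit."""
    logit = 0.5 * x_obs + weight_z * z
    return softmax([0.0, logit])


def marginal_and_impact(x_obs, weight_z, num_samples=1000, seed=0):
    """Monte Carlo estimate of the marginal predictive distribution, plus the
    across-sample variance of the class probabilities (summed over classes)
    as an illustrative, assumed impact score."""
    rng = random.Random(seed)
    probs = [
        predict_proba(x_obs, sample_latent(x_obs, rng), weight_z)
        for _ in range(num_samples)
    ]
    per_class = list(zip(*probs))  # one tuple of sampled probabilities per class
    marginal = [fmean(p) for p in per_class]
    impact = sum(pvariance(p) for p in per_class)
    return marginal, impact


if __name__ == "__main__":
    marginal, impact = marginal_and_impact(x_obs=0.3, weight_z=2.0)
    _, impact_ignored = marginal_and_impact(x_obs=0.3, weight_z=0.0)
    print("marginal:", marginal)
    print("impact when the predictor uses z:", impact)
    print("impact when the predictor ignores z:", impact_ignored)
```

In this toy setup, a predictor that ignores the latent sample (`weight_z=0.0`) yields identical predictions for every draw, so the score is zero. A predictor that relies on it produces predictions that spread out across draws, which is the intuition behind reading variance as a signal of predictive impact.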