[2604.05834] Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
Computer Science > Machine Learning
arXiv:2604.05834 (cs)
[Submitted on 7 Apr 2026]
Title: Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
Authors: Tillmann Rheude, Stefan Hegselmann, Roland Eils, Benjamin Wild
Abstract: Multimodal contrastive learning is increasingly enriched by going beyond image-text pairs. Among recent contrastive methods, Symile is a strong approach for this challenge because its multiplicative interaction objective captures higher-order cross-modal dependence. Yet, we find that Symile treats all modalities symmetrically and does not explicitly model reliability differences, a limitation that becomes especially present in trimodal multiplicative interactions. In practice, modalities beyond image-text pairs can be misaligned, weakly informative, or missing, and treating them uniformly can silently degrade performance. This fragility can be hidden in the multiplicative interaction: Symile may outperform pairwise CLIP even if a single unreliable modality silently corrupts the product terms. We propose Gated Symile, a contrastive gating mechanism that adapts modality contributions on an attention-based, per-candidate basis. The gate suppresses unreliable inputs by interpolating embeddings toward learnable neutral d...
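
The interpolation idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the multiplicative interaction is a trilinear product (the sum over dimensions of the elementwise product of three embeddings), and it swaps in two simplifications: a fixed all-ones neutral vector in place of the learnable neutral element, and a hand-set scalar gate g in place of the attention-based, per-candidate gate. The names mip, gate, and neutral are hypothetical.

```python
import numpy as np


def mip(a, b, c):
    """Trilinear (multiplicative) interaction score: sum_i a_i * b_i * c_i."""
    return float(np.sum(a * b * c))


def gate(embedding, neutral, g):
    """Interpolate an embedding toward a neutral vector.

    g = 1.0 keeps the embedding unchanged; g = 0.0 replaces it with `neutral`.
    In this sketch g is set by hand (assumption); the abstract describes an
    attention-based, per-candidate gate instead.
    """
    return g * embedding + (1.0 - g) * neutral


rng = np.random.default_rng(0)
d = 8
a, b, c = rng.normal(size=(3, d))  # three modality embeddings (toy data)

# Assumption: an all-ones vector is used as the neutral element because it is
# the identity of the elementwise product. The abstract's neutral element is learnable.
neutral = np.ones(d)

score_full = mip(a, b, c)                           # all three modalities contribute
score_gated_off = mip(a, b, gate(c, neutral, 0.0))  # third modality fully suppressed
pairwise = float(a @ b)                             # pairwise dot product of a and b
```

Under these assumptions, fully closing the gate on the third modality makes the trilinear score collapse to the pairwise dot product of the other two, while an open gate leaves the score unchanged. This is one way to see how an unreliable modality can be kept from corrupting the product terms.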