[2511.21331] The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment

arXiv - AI 4 min read

Computer Science > Computer Vision and Pattern Recognition

arXiv:2511.21331 (cs)
[Submitted on 26 Nov 2025 (v1), last revised 3 Apr 2026 (this version, v2)]

Title: The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment

Authors: Stefanos Koutoupis, Michaela Areti Zervou, Konstantinos Kontras, Maarten De Vos, Panagiotis Tsakalides, Grigorios Tsagkatakis

Abstract: Learning joint representations across multiple modalities remains a central challenge in multimodal machine learning. Prevailing approaches predominantly operate in pairwise settings, aligning two modalities at a time. While some recent methods aim to capture higher-order interactions among multiple modalities, they often overlook or insufficiently preserve pairwise relationships, limiting their effectiveness on single-modality tasks. In this work, we introduce Contrastive Fusion (ConFu), a framework that jointly embeds both individual modalities and their fused combinations into a unified representation space, where modalities and their fused counterparts are aligned. ConFu extends traditional pairwise contrastive objectives with an additional fused-modality contrastive term, encouraging the joint embedding of modality pairs with a third modality. This formulation enables ConFu to capture higher-order dependencies...
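To make the objective described in the abstract concrete, here is a minimal sketch of a loss that sums pairwise contrastive terms and adds a fused-modality term aligning each fused pair with the remaining third modality. It is not the authors' implementation. Everything specific in it is an assumption for illustration: the symmetric InfoNCE form of the contrastive loss, the element-wise mean used as a stand-in for the fusion step, the use of all three pair/third-modality combinations, and the names `info_nce`, `fuse`, `confu_style_loss`, `fused_weight`, and `temperature`.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    # L2-normalize a vector (assumes v is not the zero vector).
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(anchors, targets, temperature=0.1):
    """Symmetric InfoNCE: anchors[i] and targets[i] form the positive pair."""
    a = [normalize(v) for v in anchors]
    t = [normalize(v) for v in targets]
    n = len(a)
    logits = [[dot(a[i], t[j]) / temperature for j in range(n)] for i in range(n)]

    def cross_entropy_diag(rows):
        # Mean cross-entropy with the diagonal entry as the correct class.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))


def fuse(x, y):
    # Hypothetical fusion: element-wise mean of two modality embeddings.
    # The abstract does not specify the fusion module; this is a placeholder.
    return [[(a + b) / 2 for a, b in zip(u, v)] for u, v in zip(x, y)]


def confu_style_loss(m1, m2, m3, fused_weight=1.0, temperature=0.1):
    # Pairwise terms: align every pair of individual modalities.
    pairwise = (
        info_nce(m1, m2, temperature)
        + info_nce(m1, m3, temperature)
        + info_nce(m2, m3, temperature)
    )
    # Fused terms: align each fused pair with the remaining third modality.
    fused = (
        info_nce(fuse(m1, m2), m3, temperature)
        + info_nce(fuse(m1, m3), m2, temperature)
        + info_nce(fuse(m2, m3), m1, temperature)
    )
    return pairwise + fused_weight * fused


if __name__ == "__main__":
    # Toy batch of three samples with three-dimensional embeddings per modality.
    eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    shuffled = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
    print("aligned:   ", confu_style_loss(eye, eye, eye))
    print("misaligned:", confu_style_loss(eye, eye, shuffled))
```

In this toy setup the loss is close to zero when all three modalities agree on every sample and grows when one modality is permuted, which is the behaviour a combined pairwise and fused objective is meant to reward. A real system would replace `fuse` with a learned module and the lists with encoder outputs.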

Originally published on April 06, 2026. Curated by AI News.
