[2603.29677] Mind the Gap: A Framework for Assessing Pitfalls in Multimodal Active Learning
Computer Science > Machine Learning
arXiv:2603.29677 (cs)
[Submitted on 31 Mar 2026]

Title: Mind the Gap: A Framework for Assessing Pitfalls in Multimodal Active Learning
Authors: Dustin Eisenhardt, Yunhee Jeong, Florian Buettner

Abstract: Multimodal learning enables neural networks to integrate information from heterogeneous sources, but active learning in this setting faces distinct challenges. These include missing modalities, differences in modality difficulty, and varying interaction structures. These are issues absent in the unimodal case. While the behavior of active learning strategies in unimodal settings is well characterized, their behavior under such multimodal conditions remains poorly understood. We introduce a new framework for benchmarking multimodal active learning that isolates these pitfalls using synthetic datasets, allowing systematic evaluation without confounding noise. Using this framework, we compare unimodal and multimodal query strategies and validate our findings on two real-world datasets. Our results show that models consistently develop imbalanced representations, relying primarily on one modality while neglecting others. Existing query methods do not mitigate this effect, and multimodal strategies do not consistently outperform unimodal ones. These ...