[2602.19140] CaReFlow: Cyclic Adaptive Rectified Flow for Multimodal Fusion
Summary
The paper presents CaReFlow, a multimodal fusion method that narrows the modality gap by extending rectified flow to map the distribution of one modality onto another. It adds adaptive relaxed alignment and a cyclic flow, and reports competitive results on multimodal affective computing tasks.
Why It Matters
The modality gap limits how effective multimodal fusion can be, which makes it a central problem for models that combine several input modalities. By improving distribution alignment and reducing information loss during feature transfer, CaReFlow could benefit applications that depend on accurate interpretation of multimodal data, particularly affective computing.
Key Takeaways
- CaReFlow utilizes cyclic adaptive rectified flow to improve multimodal fusion.
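- The method addresses modality gaps through one-to-many mapping, which lets each source-modality data point observe the overall target distribution.
- Adaptive relaxed alignment enforces stricter alignment for modality pairs from the same sample, which makes the distribution mapping more accurate.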
- Cyclic rectified flow reduces information loss during feature transfer.
- The approach shows competitive results in multimodal affective computing tasks.
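The one-to-many mapping and the stricter same-sample alignment described in the takeaways can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the standard rectified flow objective (regress a velocity field onto `x1 - x0` along the straight path between a source point `x0` and a target point `x1`), and the helper names `one_to_many_pairs` and `rectified_flow_loss`, the random-pairing scheme, and the fixed weights `same_weight` and `other_weight` are hypothetical stand-ins for whatever the paper actually uses.

```python
import numpy as np


def one_to_many_pairs(n, k, rng):
    """Pair each of n source indices with its own target index plus k random ones.

    Hypothetical helper: the paper's actual sampling scheme may differ.
    """
    src_idx = np.repeat(np.arange(n), k + 1)
    own = np.arange(n)[:, None]               # target from the same sample
    rand = rng.integers(0, n, size=(n, k))    # targets drawn from the whole batch
    tgt_idx = np.concatenate([own, rand], axis=1).reshape(-1)
    return src_idx, tgt_idx


def rectified_flow_loss(velocity_fn, x_src, x_tgt, src_idx, tgt_idx, rng,
                        same_weight=1.0, other_weight=0.5):
    """Weighted rectified flow loss over (source, target) pairs.

    The fixed weights are an assumption that mimics "stricter alignment for
    same-sample pairs, relaxed alignment otherwise".
    """
    x0, x1 = x_src[src_idx], x_tgt[tgt_idx]
    t = rng.uniform(size=(len(src_idx), 1))   # one time step per pair
    x_t = (1.0 - t) * x0 + t * x1             # point on the straight path
    target_v = x1 - x0                        # constant velocity of that path
    per_pair = np.mean((velocity_fn(x_t, t) - target_v) ** 2, axis=1)
    weights = np.where(src_idx == tgt_idx, same_weight, other_weight)
    return float(np.sum(weights * per_pair) / np.sum(weights))


# Toy features standing in for two modalities (8 samples, 4 dimensions).
rng = np.random.default_rng(0)
x_src = rng.normal(size=(8, 4))
x_tgt = rng.normal(size=(8, 4)) + 3.0
src_idx, tgt_idx = one_to_many_pairs(n=8, k=3, rng=rng)

# An untrained stand-in for the velocity network: always predicts zero.
zero_velocity = lambda x_t, t: np.zeros_like(x_t)
loss = rectified_flow_loss(zero_velocity, x_src, x_tgt, src_idx, tgt_idx, rng)
print(f"loss with a zero velocity field: {loss:.3f}")
```

In practice `velocity_fn` would be a trainable network and the loss would be minimized by gradient descent. The sketch only shows how the pairs and regression targets are formed, and it leaves out the cyclic (backward) flow entirely.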
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.19140 (cs)
[Submitted on 22 Feb 2026]
Title: CaReFlow: Cyclic Adaptive Rectified Flow for Multimodal Fusion
Authors: Sijie Mai, Shiqin Han
Abstract: Modality gap significantly restricts the effectiveness of multimodal fusion. Previous methods often use techniques such as diffusion models and adversarial learning to reduce the modality gap, but they typically focus on one-to-one alignment without exposing the data points of the source modality to the global distribution information of the target modality. To this end, leveraging the characteristic of rectified flow that can map one distribution to another via a straight trajectory, we extend rectified flow for modality distribution mapping. Specifically, we leverage the 'one-to-many mapping' strategy in rectified flow that allows each data point of the source modality to observe the overall target distribution. This also alleviates the issue of insufficient paired data within each sample, enabling a more robust distribution transformation. Moreover, to achieve more accurate distribution mapping and address the ambiguous flow directions in one-to-many mapping, we design 'adaptive relaxed alignment', enforcing stricter alignment for modality pairs belonging to the same sample, while applying relaxed mappin...