[2603.04803] Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation
Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.04803 (cs)

[Submitted on 5 Mar 2026]

Title: Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation

Authors: Boyu Han, Qianqian Xu, Shilong Bao, Zhiyong Yang, Ruochen Cui, Xilin Zhao, Qingming Huang

Abstract: The limited understanding capacity of the visual encoder in Contrastive Language-Image Pre-training (CLIP) has become a key bottleneck for downstream performance. This capacity includes both Discriminative Ability (D-Ability), which reflects class separability, and Detail Perceptual Ability (P-Ability), which focuses on fine-grained visual cues. Recent solutions use diffusion models to enhance representations by conditioning image reconstruction on CLIP visual tokens. We argue that such paradigms may compromise D-Ability and therefore fail to effectively address CLIP's representation limitations. To address this, we integrate contrastive signals into diffusion-based reconstruction to pursue more comprehensive visual representations. We begin with a straightforward design that augments the diffusion process with contrastive learning on input images. However, empirical results show that the naive combination suffers from gradient conflict and yields suboptimal performance. ...
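The abstract's "straightforward design" jointly optimizes a reconstruction objective and a contrastive objective on the same encoder, and attributes its failure to gradient conflict between the two losses. A minimal sketch of that setup and of one common conflict diagnostic (cosine similarity between the two losses' gradients on the shared parameters, negative when they conflict) is below. All names here are hypothetical stand-ins: a tiny linear layer plays the role of CLIP's visual encoder, another plays the diffusion-based reconstruction branch, and a supervised InfoNCE loss stands in for the paper's contrastive signal; this is not the authors' actual architecture or training recipe.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny "visual encoder" (role of CLIP's image
# encoder) and a "decoder" (role of the diffusion reconstruction branch).
encoder = torch.nn.Linear(16, 8)
decoder = torch.nn.Linear(8, 16)

def recon_loss(x):
    """Reconstruction surrogate for the diffusion branch (targets P-Ability)."""
    return F.mse_loss(decoder(encoder(x)), x)

def contrastive_loss(x, labels, temperature=0.1):
    """Supervised-InfoNCE surrogate for the contrastive branch (D-Ability)."""
    z = F.normalize(encoder(x), dim=-1)
    logits = z @ z.t() / temperature
    # Mask out self-similarity with a large negative value (finite, so that
    # masked entries contribute exactly zero after multiplying by `pos`).
    eye = torch.eye(len(x), dtype=torch.bool)
    logits = logits.masked_fill(eye, -1e9)
    pos = (labels[:, None] == labels[None, :]).float().masked_fill(eye, 0.0)
    log_prob = logits - logits.logsumexp(dim=1, keepdim=True)
    # Average negative log-likelihood over each sample's positives.
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()

def grad_cosine(loss_a, loss_b, params):
    """Cosine similarity between the two losses' gradients w.r.t. the shared
    encoder parameters; a negative value indicates gradient conflict."""
    ga = torch.autograd.grad(loss_a, params, retain_graph=True)
    gb = torch.autograd.grad(loss_b, params, retain_graph=True)
    va = torch.cat([g.flatten() for g in ga])
    vb = torch.cat([g.flatten() for g in gb])
    return F.cosine_similarity(va, vb, dim=0).item()

x = torch.randn(8, 16)                      # toy batch of "images"
labels = torch.randint(0, 2, (8,))          # toy class labels
params = list(encoder.parameters())         # parameters shared by both losses

l_rec = recon_loss(x)
l_con = contrastive_loss(x, labels)
cos = grad_cosine(l_rec, l_con, params)
naive_total = l_rec + l_con                 # the "naive combination"
print(f"recon={l_rec.item():.3f}  contrastive={l_con.item():.3f}  grad-cos={cos:+.3f}")
```

With this diagnostic in hand, a negative `grad-cos` on a real batch is the kind of evidence the abstract alludes to when it says the naive sum of the two objectives "suffers from gradient conflict"; the truncated abstract presumably goes on to describe how the authors resolve it.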