[2602.17395] SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery

Summary

The paper presents SpectralGCD, a multimodal approach to Generalized Category Discovery (GCD) that represents each image by its CLIP similarities to a dictionary of semantic concepts, aiming to improve accuracy while reducing computational cost.

Why It Matters

According to the abstract, recent multimodal GCD methods treat the image and text modalities independently and incur high computational cost. SpectralGCD addresses both limitations with a single cross-modal representation, which could make category discovery more practical in settings with limited labeled data.

Key Takeaways

  • SpectralGCD improves Generalized Category Discovery by using cross-modal representations.
  • The method reduces reliance on spurious visual cues by expressing each image as a mixture over semantic concepts.
  • It achieves competitive accuracy at lower computational cost than state-of-the-art methods.
  • The approach uses knowledge distillation to preserve the semantic quality of the representations learned by an efficient student.
  • Code for SpectralGCD is publicly available, promoting further research and application.

arXiv:2602.17395 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 19 Feb 2026

Title: SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery

Authors: Lorenzo Caselli, Marco Mistretta, Simone Magistri, Andrew D. Bagdanov

Abstract: Generalized Category Discovery (GCD) aims to identify novel categories in unlabeled data while leveraging a small labeled subset of known classes. Training a parametric classifier solely on image features often leads to overfitting to old classes, and recent multimodal approaches improve performance by incorporating textual information. However, they treat modalities independently and incur high computational cost. We propose SpectralGCD, an efficient and effective multimodal approach to GCD that uses CLIP cross-modal image-concept similarities as a unified cross-modal representation. Each image is expressed as a mixture over semantic concepts from a large task-agnostic dictionary, which anchors learning to explicit semantics and reduces reliance on spurious visual cues. To maintain the semantic quality of representations learned by an efficient student, we introduce Spectral Filtering which exploits a cross-modal covariance matrix over the softmaxed s...
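The abstract describes each image as a mixture over semantic concepts derived from CLIP image-concept similarities. A minimal sketch of one plausible reading follows: cosine similarities between an image embedding and each concept embedding, passed through a temperature-scaled softmax. The function names, the temperature value, and the toy 3-dimensional vectors are all assumptions made for illustration. Real embeddings would come from a CLIP image and text encoder, and the paper's exact formulation may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def concept_mixture(image_emb, concept_embs, temperature=0.1):
    """Hypothetical helper: turn image-concept cosine similarities
    into a probability distribution over the concept dictionary."""
    img = l2_normalize(image_emb)
    sims = []
    for concept in concept_embs:
        con = l2_normalize(concept)
        sims.append(sum(a * b for a, b in zip(img, con)))
    # The temperature is an assumed hyperparameter, not taken from the paper.
    return softmax([s / temperature for s in sims])


# Toy stand-ins for CLIP embeddings (assumption: real ones are much larger).
image = [0.9, 0.1, 0.0]
concepts = [
    [1.0, 0.0, 0.0],  # concept 0, closest to the image
    [0.0, 1.0, 0.0],  # concept 1
    [0.0, 0.0, 1.0],  # concept 2
]

mixture = concept_mixture(image, concepts)
print(mixture)
```

In this toy case the resulting weights sum to one and most of the mass falls on concept 0, the concept whose embedding is closest to the image. The Spectral Filtering step is not sketched here because its description is cut off in the abstract above.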
