[2602.16590] A Contrastive Learning Framework Empowered by Attention-based Feature Adaptation for Street-View Image Classification

Summary

This paper presents CLIP-MHAdapter, a lightweight adaptation of CLIP for street-view image attribute classification. It appends a bottleneck MLP with multi-head self-attention over patch tokens and, with approximately 1.4 million trainable parameters, reports superior or competitive accuracy at low computational cost.

Why It Matters

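Street-view image attribute classification supports applications such as autonomous driving, urban analytics, and high-definition map construction. Existing CLIP adaptation and fine-tuning methods often rely on global image embeddings, which limits how well they capture fine-grained, localised attributes in cluttered street scenes. This work proposes a lightweight adapter that attends over patch tokens to address that limitation while keeping the number of trainable parameters small.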

Key Takeaways

  • CLIP-MHAdapter targets street-view image attribute classification.
  • It applies multi-head self-attention to patch tokens to model inter-patch dependencies.
  • It reports superior or competitive accuracy with approximately 1.4 million trainable parameters.
  • It addresses the reliance of existing adaptation methods on global image embeddings.
  • Intended applications include autonomous driving, urban analytics, and high-definition map construction.
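
As a rough illustration of why an adapter of this kind stays small, the sketch below counts the weights of a bottleneck MLP whose hidden layer carries standard multi-head self-attention. The layout (down-projection, four attention projections, up-projection), the function names, and the widths 768 and 256 are assumptions chosen for illustration, not the paper's configuration, so the total is not expected to match the reported parameter count.

```python
def linear_params(n_in, n_out, bias=True):
    """Number of weights in one dense layer, optionally with a bias vector."""
    return n_in * n_out + (n_out if bias else 0)


def adapter_params(d_model, d_bottleneck):
    """Trainable weights in a hypothetical bottleneck adapter with self-attention."""
    down = linear_params(d_model, d_bottleneck)            # d_model -> bottleneck
    attn = 4 * linear_params(d_bottleneck, d_bottleneck)   # Q, K, V, output projections
    up = linear_params(d_bottleneck, d_model)              # bottleneck -> d_model
    return down + attn + up


# Illustrative widths only (assumed, not taken from the paper).
print(adapter_params(768, 256))  # 657408
```

In standard multi-head attention the head count only splits the projection width across heads, so it does not change this total; the bottleneck width is what drives the cost.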

Computer Science > Computer Vision and Pattern Recognition

arXiv:2602.16590 (cs) [Submitted on 18 Feb 2026]

Title: A Contrastive Learning Framework Empowered by Attention-based Feature Adaptation for Street-View Image Classification

Authors: Qi You, Yitai Cheng, Zichao Zeng, James Haworth

Abstract: Street-view image attribute classification is a vital downstream task of image classification, enabling applications such as autonomous driving, urban analytics, and high-definition map construction. It remains computationally demanding whether training from scratch, initialising from pre-trained weights, or fine-tuning large models. Although pre-trained vision-language models such as CLIP offer rich image representations, existing adaptation or fine-tuning methods often rely on their global image embeddings, limiting their ability to capture fine-grained, localised attributes essential in complex, cluttered street scenes. To address this, we propose CLIP-MHAdapter, a variant of the current lightweight CLIP adaptation paradigm that appends a bottleneck MLP equipped with multi-head self-attention operating on patch tokens to model inter-patch dependencies. With approximately 1.4 million trainable parameters, CLIP-MHAdapter achieves superior or competitive accuracy across eight ...
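
The adapter idea in the abstract can be sketched in dependency-free Python. This is a minimal sketch under assumptions, not the authors' implementation: the ordering (down-project, self-attention across patch tokens, ReLU, up-project), the residual blend weight `alpha`, the toy sizes, and every function name here are hypothetical, and random weights stand in for learned ones.

```python
import math
import random


def matmul(a, b):
    """Multiply matrices stored as lists of rows: (n, m) @ (m, p) -> (n, p)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained adapter would learn these."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(d_model, d_bottleneck, rng):
    """Hypothetical layout: down, Q/K/V/output, and up projections (no biases)."""
    shapes = {"down": (d_model, d_bottleneck), "up": (d_bottleneck, d_model)}
    for name in ("wq", "wk", "wv", "wo"):
        shapes[name] = (d_bottleneck, d_bottleneck)
    return {name: rand_matrix(r, c, rng) for name, (r, c) in shapes.items()}


def attention_head(q, k, v):
    """Scaled dot-product attention for one head; each patch attends to all patches."""
    scale = 1.0 / math.sqrt(len(q[0]))
    weights = [softmax([scale * sum(a * b for a, b in zip(qi, kj)) for kj in k])
               for qi in q]
    return matmul(weights, v)


def mh_adapter(tokens, params, num_heads, alpha=0.2):
    """Adapt patch tokens of shape (num_patches, d_model); alpha is an assumed blend weight."""
    hidden = matmul(tokens, params["down"])
    if len(hidden[0]) % num_heads:
        raise ValueError("bottleneck width must be divisible by num_heads")
    q, k, v = (matmul(hidden, params[name]) for name in ("wq", "wk", "wv"))
    head_dim = len(hidden[0]) // num_heads
    heads = []
    for i in range(num_heads):
        cols = slice(i * head_dim, (i + 1) * head_dim)
        heads.append(attention_head(*([row[cols] for row in m] for m in (q, k, v))))
    # Concatenate the per-head outputs back to the bottleneck width.
    concat = [sum((head[n] for head in heads), []) for n in range(len(tokens))]
    mixed = [[max(0.0, x) for x in row] for row in matmul(concat, params["wo"])]
    adapted = matmul(mixed, params["up"])
    # Residual blend: with alpha=0.2, 80% of each original token is kept.
    return [[alpha * a + (1 - alpha) * t for a, t in zip(ar, tr)]
            for ar, tr in zip(adapted, tokens)]


rng = random.Random(0)
tokens = rand_matrix(4, 8, rng, scale=1.0)  # 4 toy patch tokens of width 8
params = init_params(8, 4, rng)
out = mh_adapter(tokens, params, num_heads=2)
print(len(out), len(out[0]))  # 4 8
```

Because the attention weights are computed between every pair of patch tokens, each adapted token can draw on other regions of the image, which is the contrast with adapters that see only a single global embedding. Setting `alpha=0.0` returns the input tokens unchanged, which is a quick sanity check for the residual path.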
