[2408.13516] Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

[2408.13516] Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2408.13516: Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

Computer Science > Computer Vision and Pattern Recognition arXiv:2408.13516 (cs) [Submitted on 24 Aug 2024 (v1), last revised 30 Mar 2026 (this version, v2)] Title:Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection Authors:Yujin Lee, Sewon Kim, Daeun Moon, Seoyoon Jang, Hyunsoo Yoon View a PDF of the paper titled Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection, by Yujin Lee and 4 other authors View PDF HTML (experimental) Abstract:Few-shot multi-class anomaly detection is crucial in real industrial settings, where only a few normal samples are available while numerous object types must be inspected. This setting is challenging as defect patterns vary widely across categories while normal samples remain scarce. Existing vision-language model-based approaches typically depend on class-specific anomaly descriptions or auxiliary modules, limiting both scalability and computational efficiency. In this work, we propose AnoPLe, a lightweight multimodal prompt learning framework that removes reliance on anomaly-type textual descriptions and avoids any external modules. AnoPLe employs bidirectional interactions between textual and visual prompts, allowing class semantics and instance-level cues to refine one another and form class-conditioned representations that capture shared normal patterns across categories. To enhance localization, we design a scale-aware pr...

Originally published on March 31, 2026. Curated by AI News.

Related Articles

Llms

wtf bro did what? arc 3 2026

The Physarum Explorer is a high-speed, bio-inspired neural model designed specifically for ARC geometry. Here is the snapshot of its curr...

Reddit - Artificial Intelligence · 1 min ·
Llms

A robot car with a Claude AI brain started a YouTube vlog about its own existence

Not a demo reel. Not a tutorial. A robot narrating its own experience — debugging, falling off shelves, questioning its identity. First-p...

Reddit - Artificial Intelligence · 1 min ·
Llms

Study: LLMs Able to De-Anonymize User Accounts on Reddit, Hacker News & Other "Pseudonymous" Platforms; Report Co-Author Expands, Advises

Advice from the study's co-author: "Be aware that it’s not any single post that identifies you, but the combination of small details acro...

Reddit - Artificial Intelligence · 1 min ·
Llms

do you guys actually trust AI tools with your data?

idk if it’s just me but lately i’ve been thinking about how casually we use stuff like chatgpt and claude for everything like coding, ran...

Reddit - Artificial Intelligence · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime