[2603.01305] AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models
Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.01305 (cs)

[Submitted on 1 Mar 2026]

Title: AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models

Authors: Zhen Qu, Xian Tao, Xiaoyi Bao, Dingrong Wang, ShiChen Qu, Zhengtao Zhang, Xingang Wang

Abstract: Large multimodal models (LMMs) exhibit strong task generalization capabilities, offering new opportunities for zero-shot visual anomaly segmentation (ZSAS). However, existing LMM-based segmentation approaches still face fundamental limitations: anomaly concepts are inherently abstract and context-dependent, lacking stable visual prototypes, and the weak alignment between high-level semantic embeddings and pixel-level spatial features hinders precise anomaly localization. To address these challenges, we present AG-VAS (Anchor-Guided Visual Anomaly Segmentation), a new framework that expands the LMM vocabulary with three learnable semantic anchor tokens ([SEG], [NOR], and [ANO]), establishing a unified anchor-guided segmentation paradigm. Specifically, [SEG] serves as an absolute semantic anchor that translates abstract anomaly semantics into explicit, spatially grounded visual entities (e.g., holes or scratches), while [NOR] and [ANO] act as relative anchors that model the contextual contrast between normal and anomalous regions...
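
The abstract only states that the vocabulary is extended with three learnable anchor tokens whose embeddings guide segmentation; the actual AG-VAS backbone, decoder, and losses are not specified here. The toy PyTorch sketch below is therefore an illustrative assumption of the general pattern: append [SEG], [NOR], and [ANO] token ids to the input sequence, take their final hidden states, and ground each one against pixel features via dot products, with the [NOR]/[ANO] contrast modulating the [SEG] map. All class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class AnchorGuidedSegHead(nn.Module):
    """Hypothetical sketch of anchor-guided segmentation, not the AG-VAS model."""

    def __init__(self, vocab_size=1000, d_model=256, n_anchors=3):
        super().__init__()
        # Extend the vocabulary with learnable [SEG], [NOR], [ANO] embeddings.
        self.embed = nn.Embedding(vocab_size + n_anchors, d_model)
        self.anchor_ids = torch.arange(vocab_size, vocab_size + n_anchors)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.pix_proj = nn.Conv2d(d_model, d_model, 1)  # align pixel features

    def forward(self, text_ids, pixel_feats):
        # text_ids: (B, T) token ids; pixel_feats: (B, C, H, W) visual features.
        B = text_ids.size(0)
        anchors = self.anchor_ids.to(text_ids.device).unsqueeze(0).expand(B, -1)
        seq = torch.cat([text_ids, anchors], dim=1)   # append anchor tokens
        hidden = self.encoder(self.embed(seq))        # (B, T+3, D)
        a = hidden[:, -3:, :]                         # anchor hidden states
        p = self.pix_proj(pixel_feats).flatten(2)     # (B, D, H*W)
        # Ground each anchor spatially: one similarity map per anchor token.
        maps = torch.einsum('bad,bdn->ban', a, p)     # (B, 3, H*W)
        seg, nor, ano = maps.unbind(1)
        # [SEG] gives absolute segmentation logits; the [NOR]/[ANO] pair is
        # turned into a per-pixel anomaly-vs-normal contrast (assumption).
        contrast = torch.stack([nor, ano], dim=1).softmax(dim=1)[:, 1]
        logits = seg + contrast
        H, W = pixel_feats.shape[-2:]
        return logits.view(B, H, W)

model = AnchorGuidedSegHead()
ids = torch.randint(0, 1000, (2, 8))       # dummy instruction tokens
feats = torch.randn(2, 256, 16, 16)        # dummy pixel features
print(model(ids, feats).shape)             # torch.Size([2, 16, 16])
```

In this reading, [SEG] plays the role of an absolute anchor (its map alone already defines segmentation logits), while [NOR] and [ANO] only contribute through their relative softmax contrast, mirroring the absolute/relative anchor split described in the abstract.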