[2604.08110] OV-Stitcher: A Global Context-Aware Framework for Training-Free Open-Vocabulary Semantic Segmentation



Computer Science > Computer Vision and Pattern Recognition

arXiv:2604.08110 (cs) [Submitted on 9 Apr 2026 (v1), last revised 10 Apr 2026 (this version, v2)]

Title: OV-Stitcher: A Global Context-Aware Framework for Training-Free Open-Vocabulary Semantic Segmentation

Authors: Seungjae Moon, Seunghyun Oh, Youngmin Ro

Abstract: Training-free open-vocabulary semantic segmentation (TF-OVSS) has recently attracted attention for its ability to perform dense prediction by leveraging the pretrained knowledge of large vision and vision-language models, without requiring additional training. However, due to the limited input resolution of these pretrained encoders, existing TF-OVSS methods commonly adopt a sliding-window strategy that processes cropped sub-images independently. While effective for managing high-resolution inputs, this approach prevents global attention over the full image, leading to fragmented feature representations and limited contextual reasoning. We propose OV-Stitcher, a training-free framework that addresses this limitation by stitching fragmented sub-image features directly within the final encoder block. By reconstructing attention representations from fragmented sub-image features, OV-Stitcher enables global attention within the final encoder block, producing cohere...
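The general idea behind the abstract can be illustrated with a toy sketch: crop a feature grid into sliding windows (as existing TF-OVSS pipelines do), then, instead of keeping the windows independent, concatenate their tokens and run one attention pass over all of them so every token can attend across window boundaries. This is a minimal illustration of the stitched-global-attention concept, not the authors' implementation; the encoder, window sizes, and attention details here are all hypothetical.

```python
import numpy as np

def sliding_windows(h, w, win, stride):
    """Yield (top, left) offsets of sliding windows over an h x w grid."""
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            yield top, left

def stitch_and_attend(window_feats):
    """Toy 'stitching' step: concatenate per-window token features and run
    a single self-attention pass over the combined token set, so tokens mix
    globally rather than only within their own window (illustrative only)."""
    tokens = np.concatenate(window_feats, axis=0)   # (N_total, d)
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)         # (N, N) attention logits
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)         # row-wise softmax
    return attn @ tokens                            # globally mixed features

# Toy example: a 4x4 feature grid processed as four non-overlapping 2x2 windows.
rng = np.random.default_rng(0)
grid = rng.normal(size=(4, 4, 8))                   # (H, W, d) feature map
feats = [grid[t:t + 2, l:l + 2].reshape(-1, 8)
         for t, l in sliding_windows(4, 4, win=2, stride=2)]
out = stitch_and_attend(feats)
print(out.shape)  # (16, 8): each token attended over all windows at once
```

The contrast with the sliding-window baseline is that the baseline would call attention on each `(4, 8)` window tensor separately, so no token could see features outside its own crop; stitching first makes the attention matrix span the full image.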

Originally published on April 13, 2026. Curated by AI News.

