[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Abstract page for arXiv paper 2506.22504: Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Image recognition, detection, and visual AI
Abstract page for arXiv paper 2508.00307: Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD
Abstract page for arXiv paper 2603.25524: CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations i...
The paper presents COMmunication inspired Tokenization (COMiT), a novel framework for structured image representations that enhances obje...
This paper presents an AI-driven methodology for segmenting straylight effects in space camera sensors, enhancing image analysis in resou...
This article explores the use of vision-language models (VLMs) for non-invasive ergonomic assessment of manual lifting tasks, estimating ...
The paper presents Dataset Color Quantization (DCQ), a framework designed to compress large-scale image datasets by reducing color-space ...
The paper presents SurgAtt-Tracker, a novel framework for online surgical attention tracking that enhances minimally invasive surgery thr...
This paper explores fair allocation of indivisible goods through limited cost-sensitive sharing, demonstrating how controlled sharing can...
This paper investigates how visual artifacts from diffusion-based inpainting affect language generation in vision-language models, reveal...
The paper introduces LESA, a framework for accelerating diffusion models using learnable stage-aware predictors, achieving significant sp...
This article presents a framework for circuit tracing in vision-language models (VLMs), aiming to enhance understanding of their internal...
This article presents a novel multimodal framework for human-robot interaction that integrates video and speech processing with large lan...
This paper explores the concept of 'Epistemic Debt' in novice programming using generative AI, proposing metacognitive scripts to enhance...
The paper presents OptimusVLA, a dual-memory framework for robotic manipulation that enhances efficiency and robustness in action generat...
The paper introduces AINet, a novel framework for whole slide image analysis that addresses regional heterogeneity through anchor instanc...
This article presents a novel approach to medical image classification using prototype learning and privileged information, enhancing int...
The paper presents NoRD, a data-efficient Vision-Language-Action model that enhances autonomous driving without requiring extensive datas...
This article introduces Vision-Language Causal Graphs (VLCGs) to enhance causal reasoning in Large Vision-Language Models (LVLMs), addressing t...
The paper introduces PyVision-RL, a reinforcement learning framework designed to enhance agentic multimodal models by preventing interact...
The Recursive Belief Vision Language Model (RB-VLA) addresses limitations in current vision-language-action models by introducing a belie...
A new study reveals that anesthetizing the retina of a 'lazy' eye for two days can restore vision in mice, offering hope for treating amb...
The rail sector is embracing AI to enhance data processing and operational efficiency, with initiatives like Great British Railways lever...