[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Abstract page for arXiv paper 2506.22504: Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Image recognition, detection, and visual AI
Abstract page for arXiv paper 2508.00307: Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD
Abstract page for arXiv paper 2603.25524: CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations i...
The paper presents Squint, a novel visual reinforcement learning method that enhances training efficiency for sim-to-real robotics, achie...
The LUMEN model enhances radiological diagnosis by leveraging longitudinal imaging data and multi-modal training, improving prognostic ca...
The paper introduces SpatiaLQA, a benchmark for evaluating spatial logical reasoning in Vision-Language Models (VLMs), highlighting their...
This article presents a novel approach to open-world object detection through an interpretable framework that enhances the identification...
This paper demonstrates that standard Transformers can achieve the minimax optimal rate in nonparametric regression for Hölder functions,...
The paper discusses a novel approach to incentive-compatible exploration in bandit settings, addressing the misalignment between principa...
The paper presents VISION-ICE, an AI framework utilizing intracardiac echocardiography to identify arrhythmia origins, achieving 66.2% ac...
This article evaluates the use of DeepSpeed to enhance the scalability of Vision Transformers (ViTs) for image-centric workloads, focusin...
The paper presents a novel estimator for model evidence in Bayesian inverse problems, particularly using diffusion models, enhancing samp...
The paper introduces Momentum Guidance (MG), a novel technique for enhancing flow-based generative models, achieving significant improvem...
QuantVLA introduces a novel post-training quantization framework for Vision-Language-Action models, enhancing efficiency without addition...
The paper introduces SAS-Net, a novel framework for robust spatiotemporal registration in bidirectional photoacoustic microscopy, address...
The UI-Venus-1.5 Technical Report presents advancements in GUI agents, detailing a unified model that enhances task performance across va...
CryoLVM introduces a self-supervised learning model for cryo-electron microscopy (cryo-EM) density maps, enhancing structural representat...
This article presents MetamerGen, a novel tool that generates metamers of human scene understanding by combining low-resolution gist info...
Molmo2 introduces a new family of open-weight vision-language models that excel in video understanding and grounding, featuring innovativ...
The paper presents Fast-ThinkAct, a novel framework for efficient Vision-Language-Action reasoning that reduces inference latency while m...
CogFlow introduces a novel framework for visual mathematical problem solving, enhancing perception and reasoning through knowledge intern...
This article presents a novel data-efficient approach for fine-tuning text-to-video generation models, demonstrating that low-quality syn...
The paper presents VCFlow, a novel architecture for subject-agnostic brain visual decoding, enhancing the reconstruction of visual experi...