Computer Vision

Image recognition, detection, and visual AI

Top This Week

[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Machine Learning

arXiv - Machine Learning · 4 min ·
[2508.00307] Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD
Machine Learning

arXiv - AI · 4 min ·
[2603.25524] CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild
Computer Vision

arXiv - AI · 4 min ·

All Content

[2505.15504] Exploiting Low-Dimensional Manifold of Features for Few-Shot Whole Slide Image Classification
LLMs

arXiv - AI · 4 min ·
[2603.02142] Is Bigger Always Better? Efficiency Analysis in Resource-Constrained Small Object Detection
Machine Learning

arXiv - Machine Learning · 3 min ·
[2406.17297] Towards Camera Open-set 3D Object Detection for Autonomous Driving Scenarios
Machine Learning

arXiv - AI · 4 min ·
[2603.01214] Reasoning Boosts Opinion Alignment in LLMs
LLMs

arXiv - Machine Learning · 3 min ·
[2603.00798] Efficient Conformal Volumetry for Template-Based Segmentation
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.02087] Detection-Gated Glottal Segmentation with Zero-Shot Cross-Dataset Transfer and Clinical Feature Extraction
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.00163] A Boundary-Metric Evaluation Protocol for Whiteboard Stroke Segmentation Under Extreme Imbalance
NLP

arXiv - Machine Learning · 4 min ·
[2603.00161] SKINOPATHY AI: Smartphone-Based Ophthalmic Screening and Longitudinal Tracking Using Lightweight Computer Vision
Computer Vision

arXiv - Machine Learning · 4 min ·
[2603.01947] physfusion: A Transformer-based Dual-Stream Radar and Vision Fusion Framework for Open Water Surface Object Detection
Machine Learning

arXiv - AI · 4 min ·
[2603.00143] GrapHist: Graph Self-Supervised Learning for Histopathology
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.01602] YCDa: YCbCr Decoupled Attention for Real-time Realistic Camouflaged Object Detection
Computer Vision

arXiv - AI · 3 min ·
[2603.01361] MixerCSeg: An Efficient Mixer Architecture for Crack Segmentation via Decoupled Mamba Attention
Machine Learning

arXiv - AI · 4 min ·
[2603.01305] AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models
Machine Learning

arXiv - AI · 4 min ·
[2603.01295] Multi-Level Bidirectional Decoder Interaction for Uncertainty-Aware Breast Ultrasound Analysis
Computer Vision

arXiv - AI · 4 min ·
[2603.01250] The MAMA-MIA Challenge: Advancing Generalizability and Fairness in Breast MRI Tumor Segmentation and Treatment Response Prediction
Machine Learning

arXiv - AI · 4 min ·
[2603.01224] Monocular 3D Object Position Estimation with VLMs for Human-Robot Interaction
LLMs

arXiv - Machine Learning · 3 min ·
[2603.01348] UTICA: Multi-Objective Self-Distillation Foundation Model Pretraining for Time Series Classification
LLMs

arXiv - Machine Learning · 3 min ·
[2603.00895] Evaluating AI Grading on Real-World Handwritten College Mathematics: A Large-Scale Study Toward a Benchmark
LLMs

arXiv - Machine Learning · 4 min ·
[2603.00787] Identifying the Geographic Foci of US Local News
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.00433] TAP-SLF: Parameter-Efficient Adaptation of Vision Foundation Models for Multi-Task Ultrasound Image Analysis
LLMs

arXiv - AI · 4 min ·