Computer Vision

Image recognition, detection, and visual AI

Top This Week

[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Machine Learning

arXiv - Machine Learning · 4 min ·
[2508.00307] Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD
Machine Learning

arXiv - AI · 4 min ·
[2603.25524] CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild
Computer Vision

arXiv - AI · 4 min ·

All Content

[2603.00368] Deep Learning-Based Meat Freshness Detection with Segmentation and OOD-Aware Classification
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.00184] Zero-Shot and Supervised Bird Image Segmentation Using Foundation Models: A Dual-Pipeline Approach with Grounding DINO 1.5, YOLOv11, and SAM 2.1
LLMs

arXiv - AI · 4 min ·
[2603.00160] DINOv3 Meets YOLO26 for Weed Detection in Vegetable Crops
Machine Learning

arXiv - AI · 4 min ·
[2603.00152] Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design
LLMs

arXiv - AI · 4 min ·
[2603.00136] TinyVLM: Zero-Shot Object Detection on Microcontrollers via Vision-Language Distillation with Matryoshka Embeddings
LLMs

arXiv - AI · 3 min ·
[2603.00122] NovaLAD: A Fast, CPU-Optimized Document Extraction Pipeline for Generative AI and Data Intelligence
Machine Learning

arXiv - AI · 4 min ·
[2603.00124] OrthoAI: A Lightweight Deep Learning Framework for Automated Biomechanical Analysis in Clear Aligner Orthodontics -- A Methodological Proof-of-Concept
Machine Learning

arXiv - AI · 4 min ·
[2603.01554] S5-HES Agent: Society 5.0-driven Agentic Framework to Democratize Smart Home Environment Simulation
Computer Vision

arXiv - AI · 4 min ·
[2510.04883] CLEAR-IR: Clarity-Enhanced Active Reconstruction of Infrared Imagery
Computer Vision

arXiv - Machine Learning · 3 min ·
[2508.01941] Less is More: AMBER-AFNO -- a New Benchmark for Lightweight 3D Medical Image Segmentation
Machine Learning

arXiv - Machine Learning · 4 min ·
[2602.24222] MuViT: Multi-Resolution Vision Transformers for Learning Across Scales in Microscopy
Machine Learning

arXiv - Machine Learning · 3 min ·
[2602.24183] A multimodal slice discovery framework for systematic failure detection and explanation in medical image classification
Machine Learning

arXiv - Machine Learning · 3 min ·
[2602.24159] RAViT: Resolution-Adaptive Vision Transformer
Machine Learning

arXiv - Machine Learning · 4 min ·
[2602.23903] SegMate: Asymmetric Attention-Based Lightweight Architecture for Efficient Multi-Organ Segmentation
Machine Learning

arXiv - Machine Learning · 3 min ·
[2602.23533] Few-Shot Continual Learning for 3D Brain MRI with Frozen Foundation Models
LLMs

arXiv - Machine Learning · 4 min ·
[2602.23390] Pacing Opinion Polarization via Graph Reinforcement Learning
Machine Learning

arXiv - Machine Learning · 3 min ·
[2602.24138] Multimodal Optimal Transport for Unsupervised Temporal Segmentation in Surgical Robotics
Machine Learning

arXiv - AI · 4 min ·
[2602.23916] The Geometry of Transfer: Unlocking Medical Vision Manifolds for Training-Free Model Ranking
LLMs

arXiv - AI · 4 min ·
[2602.23575] CycleBEV: Regularizing View Transformation Networks via View Cycle Consistency for Bird's-Eye-View Semantic Segmentation
Machine Learning

arXiv - AI · 4 min ·
[2602.23514] Modelling and Simulation of Neuromorphic Datasets for Anomaly Detection in Computer Vision
Machine Learning

arXiv - Machine Learning · 4 min ·