Computer Vision

Image recognition, detection, and visual AI

Top This Week

Machine Learning

[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection

Abstract page for arXiv paper 2506.22504: Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection

arXiv - Machine Learning · 4 min · about 10 hours ago

Machine Learning

[2508.00307] Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD

Abstract page for arXiv paper 2508.00307: Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD

arXiv - AI · 4 min · about 10 hours ago

Computer Vision

[2603.25524] CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild

Abstract page for arXiv paper 2603.25524: CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations i...

arXiv - AI · 4 min · about 10 hours ago

All Content

LLMs

[2603.20020] Detached Skip-Links and $R$-Probe: Decoupling Feature Aggregation from Gradient Propagation for MLLM OCR

Abstract page for arXiv paper 2603.20020: Detached Skip-Links and $R$-Probe: Decoupling Feature Aggregation from Gradient Propagation for...

arXiv - AI · 4 min · 4 days ago

Computer Vision

[2603.19788] Learning Hierarchical Orthogonal Prototypes for Generalized Few-Shot 3D Point Cloud Segmentation

Abstract page for arXiv paper 2603.19788: Learning Hierarchical Orthogonal Prototypes for Generalized Few-Shot 3D Point Cloud Segmentation

arXiv - AI · 3 min · 4 days ago

Machine Learning

[2603.19757] Uncertainty-aware Prototype Learning with Variational Inference for Few-shot Point Cloud Segmentation

Abstract page for arXiv paper 2603.19757: Uncertainty-aware Prototype Learning with Variational Inference for Few-shot Point Cloud Segmen...

arXiv - AI · 4 min · 4 days ago

Machine Learning

[2603.19563] Dual-Domain Representation Alignment: Bridging 2D and 3D Vision via Geometry-Aware Architecture Search

Abstract page for arXiv paper 2603.19563: Dual-Domain Representation Alignment: Bridging 2D and 3D Vision via Geometry-Aware Architecture...

arXiv - AI · 4 min · 4 days ago

LLMs

[2603.19531] dinov3.seg: Open-Vocabulary Semantic Segmentation with DINOv3

Abstract page for arXiv paper 2603.19531: dinov3.seg: Open-Vocabulary Semantic Segmentation with DINOv3

arXiv - AI · 4 min · 4 days ago

Machine Learning

[2603.04720] A Benchmark Study of Neural Network Compression Methods for Hyperspectral Image Classification

Abstract page for arXiv paper 2603.04720: A Benchmark Study of Neural Network Compression Methods for Hyperspectral Image Classification

arXiv - Machine Learning · 3 min · 21 days ago

Machine Learning

[2603.05423] An interpretable prototype parts-based neural network for medical tabular data

Abstract page for arXiv paper 2603.05423: An interpretable prototype parts-based neural network for medical tabular data

arXiv - Machine Learning · 3 min · 21 days ago

Computer Vision

[2603.05067] Synchronization-based clustering on the unit hypersphere

Abstract page for arXiv paper 2603.05067: Synchronization-based clustering on the unit hypersphere

arXiv - Machine Learning · 3 min · 21 days ago

Machine Learning

[2511.14599] CCSD: Cross-Modal Compositional Self-Distillation for Robust Brain Tumor Segmentation with Missing Modalities

Abstract page for arXiv paper 2511.14599: CCSD: Cross-Modal Compositional Self-Distillation for Robust Brain Tumor Segmentation with Miss...

arXiv - AI · 4 min · 21 days ago

Machine Learning

[2603.05276] Whispering to a Blackbox: Bootstrapping Frozen OCR with Visual Prompts

Abstract page for arXiv paper 2603.05276: Whispering to a Blackbox: Bootstrapping Frozen OCR with Visual Prompts

arXiv - Machine Learning · 4 min · 21 days ago

Machine Learning

[2603.05114] UniPAR: A Unified Framework for Pedestrian Attribute Recognition

Abstract page for arXiv paper 2603.05114: UniPAR: A Unified Framework for Pedestrian Attribute Recognition

arXiv - AI · 4 min · 21 days ago

Machine Learning

[2603.04811] Meta-D: Metadata-Aware Architectures for Brain Tumor Analysis and Missing-Modality Segmentation

Abstract page for arXiv paper 2603.04811: Meta-D: Metadata-Aware Architectures for Brain Tumor Analysis and Missing-Modality Segmentation

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2603.04796] Comparative Evaluation of Traditional Methods and Deep Learning for Brain Glioma Imaging. Review Paper

Abstract page for arXiv paper 2603.04796: Comparative Evaluation of Traditional Methods and Deep Learning for Brain Glioma Imaging. Revie...

arXiv - AI · 4 min · 21 days ago

Machine Learning

[2603.04795] LAW & ORDER: Adaptive Spatial Weighting for Medical Diffusion and Segmentation

Abstract page for arXiv paper 2603.04795: LAW & ORDER: Adaptive Spatial Weighting for Medical Diffusion and Segmentation

arXiv - AI · 4 min · 21 days ago

Machine Learning

[P] On-device speech toolkit for Apple Silicon — ASR, TTS, diarization, speech-to-speech, all in native Swift

Open-source Swift package running 11 speech models on Apple Silicon via MLX (GPU) and CoreML (Neural Engine). Fully local inference, no c...

Reddit - Machine Learning · 1 min · 21 days ago

Computer Vision

[2509.16677] Segment-to-Act: Label-Noise-Robust Action-Prompted Video Segmentation Towards Embodied Intelligence

Abstract page for arXiv paper 2509.16677: Segment-to-Act: Label-Noise-Robust Action-Prompted Video Segmentation Towards Embodied Intellig...

arXiv - Machine Learning · 4 min · 22 days ago

Machine Learning

[2503.17110] Beyond Accuracy: What Matters in Designing Well-Behaved Image Classification Models?

Abstract page for arXiv paper 2503.17110: Beyond Accuracy: What Matters in Designing Well-Behaved Image Classification Models?

arXiv - Machine Learning · 4 min · 22 days ago

LLMs

[2509.24222] Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

Abstract page for arXiv paper 2509.24222: Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

arXiv - AI · 4 min · 22 days ago

Machine Learning

[2503.03141] Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation

Abstract page for arXiv paper 2503.03141: Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation

arXiv - Machine Learning · 4 min · 22 days ago

Machine Learning

[2411.19888] FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation

Abstract page for arXiv paper 2411.19888: FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation

arXiv - Machine Learning · 4 min · 22 days ago
