Computer Vision

Image recognition, detection, and visual AI

Top This Week

Machine Learning

[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection

arXiv - Machine Learning · 4 min · 3 days ago

Machine Learning

[2508.00307] Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD

arXiv - AI · 4 min · 3 days ago

Computer Vision

[2603.25524] CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild

arXiv - AI · 4 min · 3 days ago

All Content

AI Startups

Niantic: Bringing spatial intelligence to the industrial edge

Niantic is leveraging spatial intelligence to enhance industrial applications, focusing on integrating augmented reality with real-world ...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

Computer Vision

I geolocated a blurry pic from the Paris protests down to the exact coordinates using AI

The article discusses the author's successful use of an AI geolocation tool to pinpoint the exact coordinates of a blurry image from the ...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

Machine Learning

[2509.11791] Synthetic vs. Real Training Data for Visual Navigation

This paper examines the effectiveness of visual navigation policies trained in simulation versus those trained with real-world data, high...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2509.07477] MedicalPatchNet: A Patch-Based Self-Explainable AI Architecture for Chest X-ray Classification

MedicalPatchNet introduces a self-explainable AI architecture for chest X-ray classification, enhancing interpretability while maintainin...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2601.02439] WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

WebGym is an innovative open-source environment designed for training visual web agents, featuring nearly 300,000 tasks and a high-throug...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2511.15487] NTK-Guided Implicit Neural Teaching

The paper presents NTK-Guided Implicit Neural Teaching (NINT), a method that accelerates training of Implicit Neural Representations (INR...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2501.16443] Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning

The paper presents OC-STORM, an object-centric model-based reinforcement learning framework that enhances sample efficiency by leveraging...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21873] GFPL: Generative Federated Prototype Learning for Resource-Constrained and Data-Imbalanced Vision Task

The GFPL framework enhances federated learning by addressing data imbalance and communication overhead in resource-constrained vision tas...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21707] Learning spatially adaptive sparsity level maps for arbitrary convolutional dictionaries

This paper presents a novel approach to image reconstruction using spatially adaptive sparsity level maps within convolutional dictionari...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21703] Brain Tumor Segmentation with Special Emphasis on the Non-Enhancing Brain Tumor Compartment

This article presents a U-Net based deep learning architecture for segmenting brain tumors in MRI scans, focusing on the often-overlooked...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.21428] PSF-Med: Measuring and Explaining Paraphrase Sensitivity in Medical Vision Language Models

The paper introduces PSF-Med, a benchmark assessing paraphrase sensitivity in medical vision language models, revealing significant varia...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.21397] MMLoP: Multi-Modal Low-Rank Prompting for Efficient Vision-Language Adaptation

The paper presents MMLoP, a framework for efficient vision-language adaptation using low-rank prompting, achieving high accuracy with sig...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21961] Robustness in sparse artificial neural networks trained with adaptive topology

This paper explores the robustness of sparse artificial neural networks with adaptive topology, demonstrating their competitive performan...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.21601] Deep Clustering based Boundary-Decoder Net for Inter and Intra Layer Stress Prediction of Heterogeneous Integrated IC Chip

This article presents a novel approach using a Deep Clustering based Boundary-Decoder Net for predicting inter and intra-layer stress in ...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.21593] Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection

The paper introduces a novel attack method, Coherence-Preserving Semantic Injection (CSI), that exploits vulnerabilities in semantic-awar...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.10359] Beyond Calibration: Confounding Pathology Limits Foundation Model Specificity in Abdominal Trauma CT

This study evaluates the performance of foundation models in detecting abdominal trauma, revealing that specificity deficits are influenc...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.09929] Monocular Normal Estimation via Shading Sequence Estimation

This paper presents a novel approach to monocular normal estimation by reformulating the problem as shading sequence estimation, enhancin...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.00462] LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

The paper introduces LatentLens, a method for mapping visual tokens to natural language descriptions in Vision-Language Models (VLMs), en...

arXiv - AI · 4 min · about 1 month ago

Computer Vision

[2601.08026] FigEx2: Visual-Conditioned Panel Detection and Captioning for Scientific Compound Figures

The paper presents FigEx2, a framework for detecting and captioning panels in scientific compound figures, enhancing understanding and ac...

arXiv - AI · 4 min · about 1 month ago

NLP

[2512.08639] Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

This article presents a unified framework for Aerial Vision-Language Navigation (VLN), enabling UAVs to interpret natural language and na...

arXiv - AI · 4 min · about 1 month ago

