Computer Vision

Image recognition, detection, and visual AI

Top This Week

[2506.22504] Patch2Loc: Learning to Localize Patches for Unsupervised Brain Lesion Detection
Machine Learning

arXiv - Machine Learning · 4 min
[2508.00307] Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD
Machine Learning

arXiv - AI · 4 min
[2603.25524] CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild
Computer Vision

arXiv - AI · 4 min

All Content

[2602.21655] CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning
Machine Learning

The paper introduces CCCaption, a dual-reward reinforcement learning framework designed to enhance image captioning by optimizing for com...

arXiv - AI · 4 min
[2602.21613] Virtual Biopsy for Intracranial Tumors Diagnosis on MRI
AI Safety

This article presents a novel Virtual Biopsy framework for diagnosing intracranial tumors using MRI, addressing the challenges of traditi...

arXiv - AI · 4 min
[2602.21476] A Knowledge-Driven Approach to Music Segmentation, Music Source Separation and Cinematic Audio Source Separation
Machine Learning

This paper presents a knowledge-driven approach for audio segmentation and source separation, utilizing music scores and model-based tech...

arXiv - Machine Learning · 4 min
[2602.21452] Adversarial Robustness of Deep Learning-Based Thyroid Nodule Segmentation in Ultrasound
Machine Learning

This article evaluates the adversarial robustness of deep learning models for thyroid nodule segmentation in ultrasound images, highlight...

arXiv - AI · 4 min
[2602.21441] Causal Decoding for Hallucination-Resistant Multimodal Large Language Models
LLMs

This article presents a novel causal decoding framework aimed at reducing object hallucination in multimodal large language models (MLLMs...

arXiv - Machine Learning · 3 min
[2602.21421] ECHOSAT: Estimating Canopy Height Over Space And Time
Machine Learning

ECHOSAT introduces a global tree height map that captures temporal forest dynamics, enhancing carbon monitoring and disturbance assessmen...

arXiv - Machine Learning · 4 min
[2602.21372] The Mean is the Mirage: Entropy-Adaptive Model Merging under Heterogeneous Domain Shifts in Medical Imaging
Machine Learning

This article presents an entropy-adaptive model merging technique for medical imaging that addresses challenges posed by heterogeneous do...

arXiv - Machine Learning · 4 min
[2602.21365] Towards Controllable Video Synthesis of Routine and Rare OR Events
Generative AI

The paper presents a novel framework for synthesizing controlled video representations of routine and rare operating room events, address...

arXiv - Machine Learning · 4 min
[2602.21361] Towards single-shot coherent imaging via overlap-free ptychography
Computer Vision

The paper presents a novel approach to single-shot coherent imaging using overlap-free ptychography, enhancing throughput and reducing do...

arXiv - Machine Learning · 4 min
[2602.21341] Scaling View Synthesis Transformers
Machine Learning

The paper explores scaling laws for view synthesis transformers, presenting a new architecture that outperforms previous models in Novel ...

arXiv - AI · 3 min
Adobe’s new AI video editing tool stitches clips into a first draft | The Verge
AI Startups

Adobe introduces Quick Cut, an AI tool that automates the initial video editing process, allowing creators to focus on storytelling by ge...

The Verge - AI · 4 min
NLP

[D] : We ran MobileNetV2 on a Snapdragon 8 Gen 3 100 times — 83% latency spread, 7x cold-start penalty. Here's the raw data.

This article presents performance metrics of MobileNetV2 running on a Snapdragon 8 Gen 3, revealing significant latency variations and co...

Reddit - Machine Learning · 1 min
[2602.04819] XtraLight-MedMamba for Classification of Neoplastic Tubular Adenomas
Machine Learning

The article presents XtraLight-MedMamba, a deep learning framework designed for the classification of neoplastic tubular adenomas, achiev...

arXiv - Machine Learning · 4 min
[2507.19575] Is Exchangeability better than I.I.D to handle Data Distribution Shifts while Pooling Data for Data-scarce Medical image segmentation?
Machine Learning

This paper explores the effectiveness of using exchangeability over the traditional i.i.d. assumption in addressing data distribution shi...

arXiv - Machine Learning · 4 min
[2505.06646] Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification
Machine Learning

This article discusses the reproduction and enhancement of CheXNet, a deep learning model for classifying chest X-ray diseases, using the...

arXiv - Machine Learning · 3 min
[2503.06437] SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
Machine Learning

The SEED metric enhances semantic evaluation in visual brain decoding by integrating multiple metrics, revealing limitations in existing ...

arXiv - Machine Learning · 3 min
[2407.12506] Classification and reconstruction for single-pixel imaging with classical and quantum neural networks
Machine Learning

This article explores the use of classical and quantum neural networks for single-pixel imaging, demonstrating effective classification a...

arXiv - Machine Learning · 4 min
[2508.01115] A hierarchy tree data structure for behavior-based user segment representation
Computer Vision

This paper introduces a novel hierarchy tree data structure for behavior-based user segmentation, enhancing recommendation systems by add...

arXiv - Machine Learning · 4 min
[2505.13289] RECON: Robust symmetry discovery via Explicit Canonical Orientation Normalization
Machine Learning

The paper introduces RECON, a method for robust symmetry discovery through Explicit Canonical Orientation Normalization, enhancing data r...

arXiv - Machine Learning · 3 min
[2401.07390] Knee or ROC
Machine Learning

The paper 'Knee or ROC' explores accuracy measurement methods for multi-class image detection using self-attention transformers, proposin...

arXiv - Machine Learning · 3 min