Computer Vision

Image recognition, detection, and visual AI

Top This Week

Computer Vision

[2602.09678] Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap

arXiv - AI · 4 min · about 11 hours ago

LLMs

[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models

arXiv - AI · 3 min · about 11 hours ago

Computer Vision

[2603.26551] Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

arXiv - AI · 4 min · about 11 hours ago

All Content

Machine Learning

[2505.17064] Synthetic History: Evaluating Visual Representations of the Past in Diffusion Models

This article evaluates how Text-to-Image diffusion models represent historical contexts, introducing a benchmark to assess their accuracy...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2503.16021] Imitating AI agents increase diversity in homogeneous information environments but can reduce it in heterogeneous ones

This article explores how AI agents imitating human content affect information diversity, revealing context-dependent outcomes in homogen...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2505.17748] Soft-CAM: Making black box models self-explainable for medical image analysis

The paper introduces Soft-CAM, a method that enhances the interpretability of convolutional neural networks (CNNs) in medical image analy...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2505.11409] Visual Planning: Let's Think Only with Images

The paper introduces 'Visual Planning', a new paradigm that utilizes images for reasoning in spatial tasks, enhancing planning capabiliti...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2412.13897] Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model

This article presents a novel approach to data-efficient inference of neural fluid fields using SciML foundation models, demonstrating si...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[1803.09319] SUNLayer: Stable denoising with generative networks

The paper introduces SUNLayer, a theoretical framework for stable denoising using generative networks, focusing on activation functions a...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.18406] Latent Equivariant Operators for Robust Object Recognition: Promise and Challenges

The paper discusses Latent Equivariant Operators as a novel approach to enhance object recognition in computer vision, addressing challen...

arXiv - Machine Learning · 3 min · about 1 month ago

AI Infrastructure

[2411.08875] Causal Explanations for Image Classifiers

This paper presents a novel approach to generating causal explanations for image classifiers, introducing a black-box algorithm grounded ...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.18377] Theory and interpretability of Quantum Extreme Learning Machines: a Pauli-transfer matrix approach

This article presents a theoretical analysis of Quantum Extreme Learning Machines (QELMs) using the Pauli-transfer matrix approach, highl...

arXiv - Machine Learning · 4 min · about 1 month ago

Computer Vision

[2602.18350] Quantum-enhanced satellite image classification

This paper presents a quantum feature extraction method that enhances multi-class image classification for satellite applications, achiev...

arXiv - Machine Learning · 3 min · about 1 month ago

Robotics

[2602.18374] Zero-shot Interactive Perception

The paper presents Zero-Shot Interactive Perception (ZS-IP), a framework that enhances robotic manipulation through a memory-driven Visio...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.18252] On the Adversarial Robustness of Discrete Image Tokenizers

This paper investigates the adversarial robustness of discrete image tokenizers, highlighting their vulnerabilities and proposing a novel...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.18083] Comparative Assessment of Multimodal Earth Observation Data for Soil Moisture Estimation

This article presents a high-resolution framework for soil moisture estimation using multimodal Earth observation data, highlighting the ...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.18047] CityGuard: Graph-Aware Private Descriptors for Bias-Resilient Identity Search Across Urban Cameras

CityGuard introduces a novel framework for privacy-preserving identity retrieval across urban surveillance cameras, addressing challenges...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.17929] ZACH-ViT: Regime-Dependent Inductive Bias in Compact Vision Transformers for Medical Imaging

ZACH-ViT introduces a novel Vision Transformer architecture tailored for medical imaging, enhancing performance by removing fixed spatial...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.18119] RamanSeg: Interpretability-driven Deep Learning on Raman Spectra for Cancer Diagnosis

The paper presents RamanSeg, an interpretable deep learning model for analyzing Raman spectra in cancer diagnosis, achieving significant ...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.17855] TopoGate: Quality-Aware Topology-Stabilized Gated Fusion for Longitudinal Low-Dose CT New-Lesion Prediction

The paper presents TopoGate, a model designed to enhance new-lesion prediction in longitudinal low-dose CT scans by integrating quality-a...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.18094] OODBench: Out-of-Distribution Benchmark for Large Vision-Language Models

The paper introduces OODBench, a benchmark for evaluating large vision-language models' performance on out-of-distribution (OOD) data, hi...

arXiv - AI · 4 min · about 1 month ago

Data Science

[2602.18089] DohaScript: A Large-Scale Multi-Writer Dataset for Continuous Handwritten Hindi Text

DohaScript introduces a large-scale dataset for continuous handwritten Hindi text, addressing the lack of diverse and high-quality resour...

arXiv - Machine Learning · 4 min · about 1 month ago

NLP

[2602.17814] VQPP: Video Query Performance Prediction Benchmark

The paper introduces the Video Query Performance Prediction (VQPP) benchmark, addressing a gap in query performance prediction for video ...

arXiv - Machine Learning · 4 min · about 1 month ago
