Computer Vision

Image recognition, detection, and visual AI

Top This Week

Computer Vision

[2602.09678] Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap

Abstract page for arXiv paper 2602.09678: Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap

arXiv - AI · 4 min · about 13 hours ago

LLMs

[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min · about 13 hours ago

Computer Vision

[2603.26551] Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

Abstract page for arXiv paper 2603.26551: Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

arXiv - AI · 4 min · about 13 hours ago

All Content

Machine Learning

[2602.18022] Dual-Channel Attention Guidance for Training-Free Image Editing Control in Diffusion Transformers

This paper introduces Dual-Channel Attention Guidance (DCAG), a novel training-free method for enhancing image editing control in Diffusi...

arXiv - AI · 4 min · about 1 month ago

Computer Vision

[2602.18019] DeepSVU: Towards In-depth Security-oriented Video Understanding via Unified Physical-world Regularized MoE

The paper introduces DeepSVU, a novel approach for Security-oriented Video Understanding that identifies threats and evaluates their caus...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.17770] CLUTCH: Contextualized Language model for Unlocking Text-Conditioned Hand motion modelling in the wild

The paper introduces CLUTCH, a novel model for generating hand motions from text, leveraging a new dataset and advanced techniques to imp...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.17951] ROCKET: Residual-Oriented Multi-Layer Alignment for Spatially-Aware Vision-Language-Action Models

The paper presents ROCKET, a novel framework for enhancing Vision-Language-Action models by employing residual-oriented multi-layer align...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.18428] The Geometry of Noise: Why Diffusion Models Don't Need Noise Conditioning

This paper explores the concept of noise-agnostic generative models, specifically diffusion models, and argues that they do not require e...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.17871] Understanding the Fine-Grained Knowledge Capabilities of Vision-Language Models

This paper explores the fine-grained knowledge capabilities of vision-language models (VLMs), highlighting their performance on visual qu...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.17797] Deep Learning for Dermatology: An Innovative Framework for Approaching Precise Skin Cancer Detection

This article presents a deep learning framework for improving skin cancer detection using VGG16 and DenseNet201 models, achieving an accu...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.17749] Detection and Classification of Cetacean Echolocation Clicks using Image-based Object Detection Methods applied to Advanced Wavelet-based Transformations

This paper explores the use of advanced wavelet transformations and image-based object detection methods to improve the detection and cla...

arXiv - AI · 4 min · about 1 month ago

Generative AI

[2602.17690] DesignAsCode: Bridging Structural Editability and Visual Fidelity in Graphic Design Generation

The paper presents DesignAsCode, a framework that enhances graphic design generation by integrating structural editability with visual fi...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.17687] IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering

The paper introduces IRPAPERS, a benchmark for evaluating visual document retrieval and question answering, comparing image-based and tex...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.17853] Neural Prior Estimation: Learning Class Priors from Latent Representations

The paper introduces Neural Prior Estimator (NPE), a framework for learning class priors from latent representations, addressing class im...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.17689] Robust Pre-Training of Medical Vision-and-Language Models with Domain-Invariant Multi-Modal Masked Reconstruction

This article presents Robust Multi-Modal Masked Reconstruction (Robust-MMR), a novel self-supervised pre-training framework for medical v...

arXiv - Machine Learning · 4 min · about 1 month ago

NLP

[2602.17683] Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates

This paper presents a probabilistic framework for forecasting NDVI from sparse satellite data and weather covariates, enhancing precision...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[R] CVPR results

This Reddit post invites discussion on CVPR acceptance results, encouraging users to share their experiences regarding scores and rebutta...

Reddit - Machine Learning · 1 min · about 1 month ago

AI Safety

Cities Are Shredding Their AI Surveillance Contracts en Masse

Over 30 cities have terminated contracts with Flock Safety, an AI surveillance company, amid rising concerns over privacy and federal ove...

AI Tools & Products · 2 min · about 1 month ago

AI Startups

Apple’s Next Big Thing Is a Push Into Visual Artificial Intelligence

Apple is reportedly making significant strides in visual artificial intelligence, with expectations for new product launches that integra...

AI Tools & Products · 1 min · about 1 month ago

Generative AI

Fake faces generated by AI are now "too good to be true," researchers warn

Researchers highlight the increasing realism of AI-generated faces, warning that they may soon be indistinguishable from real images, rai...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

Machine Learning

[D] CVPR Findings Track

This Reddit post discusses the CVPR Findings Track, a submission opportunity for rejected papers, and asks for guidance on the submission process.

Reddit - Machine Learning · 1 min · about 1 month ago

Machine Learning

[D] Questions regarding the new Findings track at CVPR 2026

A Reddit user discusses their experience with the new Findings track at CVPR 2026, highlighting challenges faced during the submission pr...

Reddit - Machine Learning · 1 min · about 1 month ago

Machine Learning

[D] How do PhD committees view a solo-author CVPR paper?

A Reddit user seeks insights on how PhD committees perceive solo-author papers accepted at CVPR, especially from candidates without major...

Reddit - Machine Learning · 1 min · about 1 month ago
