Computer Vision

Image recognition, detection, and visual AI

Top This Week

[2602.09678] Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap
Computer Vision

Abstract page for arXiv paper 2602.09678: Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap

arXiv - AI · 4 min ·
[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models
LLMs

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min ·
[2603.26551] Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones
Computer Vision

Abstract page for arXiv paper 2603.26551: Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

arXiv - AI · 4 min ·

All Content

[2602.18022] Dual-Channel Attention Guidance for Training-Free Image Editing Control in Diffusion Transformers
Machine Learning

This paper introduces Dual-Channel Attention Guidance (DCAG), a novel training-free method for enhancing image editing control in Diffusi...

arXiv - AI · 4 min ·
[2602.18019] DeepSVU: Towards In-depth Security-oriented Video Understanding via Unified Physical-world Regularized MoE
Computer Vision

The paper introduces DeepSVU, a novel approach for Security-oriented Video Understanding that identifies threats and evaluates their caus...

arXiv - AI · 4 min ·
[2602.17770] CLUTCH: Contextualized Language model for Unlocking Text-Conditioned Hand motion modelling in the wild
LLMs

The paper introduces CLUTCH, a novel model for generating hand motions from text, leveraging a new dataset and advanced techniques to imp...

arXiv - Machine Learning · 4 min ·
[2602.17951] ROCKET: Residual-Oriented Multi-Layer Alignment for Spatially-Aware Vision-Language-Action Models
LLMs

The paper presents ROCKET, a novel framework for enhancing Vision-Language-Action models by employing residual-oriented multi-layer align...

arXiv - AI · 4 min ·
[2602.18428] The Geometry of Noise: Why Diffusion Models Don't Need Noise Conditioning
Machine Learning

This paper explores the concept of noise-agnostic generative models, specifically diffusion models, and argues that they do not require e...

arXiv - Machine Learning · 4 min ·
[2602.17871] Understanding the Fine-Grained Knowledge Capabilities of Vision-Language Models
LLMs

This paper explores the fine-grained knowledge capabilities of vision-language models (VLMs), highlighting their performance on visual qu...

arXiv - Machine Learning · 3 min ·
[2602.17797] Deep Learning for Dermatology: An Innovative Framework for Approaching Precise Skin Cancer Detection
Machine Learning

This article presents a deep learning framework for improving skin cancer detection using VGG16 and DenseNet201 models, achieving an accu...

arXiv - Machine Learning · 4 min ·
[2602.17749] Detection and Classification of Cetacean Echolocation Clicks using Image-based Object Detection Methods applied to Advanced Wavelet-based Transformations
Machine Learning

This paper explores the use of advanced wavelet transformations and image-based object detection methods to improve the detection and cla...

arXiv - AI · 4 min ·
[2602.17690] DesignAsCode: Bridging Structural Editability and Visual Fidelity in Graphic Design Generation
Generative AI

The paper presents DesignAsCode, a framework that enhances graphic design generation by integrating structural editability with visual fi...

arXiv - Machine Learning · 3 min ·
[2602.17687] IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering
LLMs

The paper introduces IRPAPERS, a benchmark for evaluating visual document retrieval and question answering, comparing image-based and tex...

arXiv - Machine Learning · 4 min ·
[2602.17853] Neural Prior Estimation: Learning Class Priors from Latent Representations
Machine Learning

The paper introduces Neural Prior Estimator (NPE), a framework for learning class priors from latent representations, addressing class im...

arXiv - Machine Learning · 3 min ·
[2602.17689] Robust Pre-Training of Medical Vision-and-Language Models with Domain-Invariant Multi-Modal Masked Reconstruction
LLMs

This article presents Robust Multi-Modal Masked Reconstruction (Robust-MMR), a novel self-supervised pre-training framework for medical v...

arXiv - Machine Learning · 4 min ·
[2602.17683] Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates
NLP

This paper presents a probabilistic framework for forecasting NDVI from sparse satellite data and weather covariates, enhancing precision...

arXiv - Machine Learning · 4 min ·
Machine Learning

[R] CVPR results

This Reddit post invites discussion on CVPR acceptance results, encouraging users to share their experiences regarding scores and rebutta...

Reddit - Machine Learning · 1 min ·
Cities Are Shredding Their AI Surveillance Contracts en Masse
AI Safety

Over 30 cities have terminated contracts with Flock Safety, an AI surveillance company, amid rising concerns over privacy and federal ove...

AI Tools & Products · 2 min ·
AI Startups

Apple’s Next Big Thing Is a Push Into Visual Artificial Intelligence

Apple is reportedly making significant strides in visual artificial intelligence, with expectations for new product launches that integra...

AI Tools & Products · 1 min ·
Generative AI

Fake faces generated by AI are now "too good to be true," researchers warn

Researchers highlight the increasing realism of AI-generated faces, warning that they may soon be indistinguishable from real images, rai...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

[D] CVPR Findings Track

The article discusses the CVPR Findings Track, a submission opportunity for rejected papers, and seeks guidance on the submission process.

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] Questions regarding the new Findings track at CVPR 2026

A Reddit user discusses their experience with the new Findings track at CVPR 2026, highlighting challenges faced during the submission pr...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] How do PhD committees view a solo-author CVPR paper?

A Reddit user seeks insights on how PhD committees perceive solo-author papers accepted at CVPR, especially from candidates without major...

Reddit - Machine Learning · 1 min ·