Computer Vision

Image recognition, detection, and visual AI


Top This Week

Computer Vision

[2602.09678] Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap

Abstract page for arXiv paper 2602.09678: Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap

arXiv - AI · 4 min · about 15 hours ago

LLMs

[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min · about 15 hours ago

Computer Vision

[2603.26551] Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

Abstract page for arXiv paper 2603.26551: Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones

arXiv - AI · 4 min · about 15 hours ago

All Content

Machine Learning

[D] Submit to ECCV or opt in for CVPR findings?

The article discusses the dilemma of submitting a paper to ECCV or opting for CVPR Findings, highlighting confusion around the perception...

Reddit - Machine Learning · 1 min · about 1 month ago

NLP

[R] Vision+Time Series data Encoder

The article discusses the need for a vision and time series data encoder, seeking recent research and pre-trained models for generating e...

Reddit - Machine Learning · 1 min · about 1 month ago

AI Startups

TikTok creators’ Seedance 2.0 AI is hyperrealistic, arrived “seemingly out of nowhere,” and is spooking Hollywood

TikTok's creators have unveiled Seedance 2.0 AI, a hyperrealistic technology that is causing concern in Hollywood due to its potential impa...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

LLMs

[R] Can Vision-Language Models See Squares? Text-Recognition Mediates Spatial Reasoning Across Three Model Families

This article discusses a study on Vision-Language Models (VLMs) that highlights their performance disparity in recognizing binary grids r...

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

OpenAI’s first ChatGPT gadget could be a smart speaker with a camera | The Verge

OpenAI is reportedly developing its first hardware product, a smart speaker with a camera, alongside potential smart glasses and a lamp, ...

The Verge - AI · 3 min · about 1 month ago

Machine Learning

[2510.25166] A Study on Inference Latency for Vision Transformers on Mobile Devices

This study quantitatively analyzes the inference latency of 190 vision transformers (ViTs) on mobile devices, comparing them to 102 convo...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2510.12581] LayerSync: Self-aligning Intermediate Layers

LayerSync introduces a novel approach to enhance diffusion models by self-aligning intermediate layers, improving training efficiency and...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2505.11235] Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation

The paper presents Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation (PSOFT), a method that enhances parameter-efficien...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.16979] Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling

The paper presents PRIMO, a supervised latent-variable model that addresses the challenges of incomplete multimodal data by quantifying t...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.16749] U-FedTomAtt: Ultra-lightweight Federated Learning with Attention for Tomato Disease Recognition

The paper presents U-FedTomAtt, an ultra-lightweight federated learning framework designed for tomato disease recognition, optimizing per...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.17642] A.R.I.S.: Automated Recycling Identification System for E-Waste Classification Using Deep Learning

The A.R.I.S. system utilizes deep learning to enhance e-waste recycling by accurately classifying materials in real-time, improving recov...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.17350] Shortcut learning in geometric knot classification

This paper explores the application of machine learning to classify geometric knots, addressing the challenge of identifying equivalent e...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.17321] The Sound of Death: Deep Learning Reveals Vascular Damage from Carotid Ultrasound

This article presents a machine learning framework that analyzes carotid ultrasound videos to identify vascular damage, enhancing early d...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.17270] Unified Latents (UL): How to train your latents

The paper introduces Unified Latents (UL), a framework for training latent representations using a diffusion prior, achieving competitive...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.17117] i-PhysGaussian: Implicit Physical Simulation for 3D Gaussian Splatting

The paper introduces i-PhysGaussian, a novel framework for implicit physical simulation that enhances 3D Gaussian Splatting by minimizing...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.06355] Di3PO - Diptych Diffusion DPO for Targeted Improvements in Image Generation

The paper presents Di3PO, a novel method for improving image generation in text-to-image diffusion models by efficiently creating targete...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2601.01224] Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment

This paper presents Contrastive Object-centric Diffusion Alignment (CODA), an enhancement to object-centric learning that reduces slot en...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2512.19941] Block-Recurrent Dynamics in Vision Transformers

This article introduces the Block-Recurrent Hypothesis (BRH) for Vision Transformers, proposing a new framework for understanding their c...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2512.07984] Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection

This article presents a novel framework for hierarchical semantic segmentation aimed at improving the detection of stratified tooth layer...

arXiv - AI · 4 min · about 1 month ago

Computer Vision

[2511.14654] Improving segmentation of retinal arteries and veins using cardiac signal in doppler holograms

This article presents a novel approach to segmenting retinal arteries and veins using cardiac signals in Doppler holograms, enhancing tra...

arXiv - AI · 3 min · about 1 month ago

