[2602.09678] Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap
Abstract page for arXiv paper 2602.09678: Administrative Law's Fourth Settlement: AI and the Capability-Accountability Trap
Image recognition, detection, and visual AI
Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...
Abstract page for arXiv paper 2603.26551: Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones
The paper presents HistCAD, a comprehensive dataset for parametric CAD modeling that incorporates geometric constraints and functional se...
The paper explores how neural fields can serve as world models, preserving sensory topology for better prediction of physical outcomes, w...
This study presents a transformer-based framework for detecting fungal elements in dermatophytosis using KOH microscopy, achieving high a...
The paper presents a novel method for assessing and recalibrating probability predictions in multiclass classification tasks, addressing ...
This paper investigates how adversarial perturbations can induce hallucinations in generative models used for MRI reconstruction, highlig...
This paper evaluates the effectiveness of generative metrics in predicting the performance of YOLO object detection models across various...
This study evaluates feature disentanglement methods to mitigate shortcut learning in medical imaging, enhancing model robustness and cla...
The paper presents an adaptive multi-agent framework for improving text-to-video retrieval systems, addressing challenges in query-depend...
The paper presents FishProtoNet, a non-invasive computer vision framework for accurately identifying the sex of delta smelt, an endangere...
This article presents a replication study of the FedTPG model, which enhances vision-language model performance in federated learning sce...
SceneTok introduces a novel tokenizer that compresses 3D scene representations into a set of diffusable tokens, achieving superior compre...
The paper presents FOCA, a novel framework for detecting and localizing image forgery using a multi-modal large language model that integ...
This article presents the Structure-Level Disentangled Diffusion Model (SLD-Font) for few-shot Chinese font generation, enhancing style f...
This paper presents a novel tensor-based framework for Vision Transformers, enhancing computational efficiency while maintaining competit...
BiMotion introduces a novel approach to dynamic 3D character generation using B-spline curves, enhancing motion quality and alignment wit...
This article explores the use of diffusion models to enhance adversarial training for robust image classifiers, demonstrating improved pe...
The paper presents LA-LoRA, a novel approach for fine-tuning large models in privacy-preserving federated learning, addressing key challe...
The paper introduces TAG, a vision-language framework for Facial Expression Recognition (FER) that enhances reasoning by grounding predic...
The paper presents a novel pipeline for synthesizing multimodal geometry datasets, introducing the GeoCode dataset which enhances visual-...
The paper presents RoboCurate, a framework for generating synthetic robot data that enhances action quality through simulation replay and...