[2604.02032] IndoorCrowd: A Multi-Scene Dataset for Human Detection, Segmentation, and Tracking with an Automated Annotation Pipeline

arXiv - Machine Learning April 03, 2026 3 min read


Computer Science > Computer Vision and Pattern Recognition

arXiv:2604.02032 (cs) [Submitted on 2 Apr 2026]

Title: IndoorCrowd: A Multi-Scene Dataset for Human Detection, Segmentation, and Tracking with an Automated Annotation Pipeline

Authors: Sebastian-Ion Nae, Radu Moldoveanu, Alexandra Stefania Ghita, Adina Magda Florea

Abstract: Understanding human behaviour in crowded indoor environments is central to surveillance, smart buildings, and human-robot interaction, yet existing datasets rarely capture real-world indoor complexity at scale. We introduce IndoorCrowd, a multi-scene dataset for indoor human detection, instance segmentation, and multi-object tracking, collected across four campus locations (ACS-EC, ACS-EG, IE-Central, R-Central). It comprises 31 videos (9,913 frames at 5 fps) with human-verified, per-instance segmentation masks. A 620-frame control subset benchmarks three foundation-model auto-annotators (SAM3, GroundingSAM, and EfficientGroundingSAM) against human labels using Cohen's κ, AP, precision, recall, and mask IoU. A further 2,552-frame subset supports multi-object tracking with continuous identity tracks in MOTChallenge format. We establish detection, segmentation, and tracking baselines using YOLOv8n, YOLOv26n, an...
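Two of the agreement metrics named in the abstract, mask IoU and Cohen's κ, can be illustrated with a minimal sketch. The abstract does not say at what level κ is computed, so the per-pixel treatment below is an assumption, and the function names and toy masks are hypothetical rather than taken from the paper.

```python
def mask_iou(a, b):
    """Intersection over union of two binary masks (equal-sized nested lists of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    # Two empty masks are treated as perfect agreement (a convention, not from the paper).
    return inter / union if union else 1.0


def cohens_kappa(a, b):
    """Cohen's kappa for two binary masks, assuming each pixel is one rating."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    n = len(flat_a)
    # Observed agreement: fraction of pixels where both annotators give the same label.
    p_o = sum(1 for x, y in zip(flat_a, flat_b) if x == y) / n
    # Chance agreement from each annotator's foreground rate.
    rate_a = sum(flat_a) / n
    rate_b = sum(flat_b) / n
    p_e = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0


# Hypothetical toy masks: a human label and an auto-annotator output.
human = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
auto = [[1, 1, 1, 0],
        [1, 0, 0, 0]]

print(mask_iou(human, auto))      # 3 shared pixels out of 5 in the union
print(cohens_kappa(human, auto))  # agreement corrected for chance
```

IoU ignores pixels that both masks mark as background, whereas κ counts them as agreement and then discounts the agreement expected by chance, so the two numbers can diverge on frames where people occupy only a small part of the image.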

Originally published on April 03, 2026. Curated by AI News.
