[2604.02032] IndoorCrowd: A Multi-Scene Dataset for Human Detection, Segmentation, and Tracking with an Automated Annotation Pipeline

arXiv:2604.02032 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 2 Apr 2026

Authors: Sebastian-Ion Nae, Radu Moldoveanu, Alexandra Stefania Ghita, Adina Magda Florea

Abstract: Understanding human behaviour in crowded indoor environments is central to surveillance, smart buildings, and human-robot interaction, yet existing datasets rarely capture real-world indoor complexity at scale. We introduce IndoorCrowd, a multi-scene dataset for indoor human detection, instance segmentation, and multi-object tracking, collected across four campus locations (ACS-EC, ACS-EG, IE-Central, R-Central). It comprises $31$ videos ($9{,}913$ frames at $5$ fps) with human-verified, per-instance segmentation masks. A $620$-frame control subset benchmarks three foundation-model auto-annotators (SAM3, GroundingSAM, and EfficientGroundingSAM) against human labels using Cohen's $\kappa$, AP, precision, recall, and mask IoU. A further $2{,}552$-frame subset supports multi-object tracking with continuous identity tracks in MOTChallenge format. We establish detection, segmentation, and tracking baselines using YOLOv8n, YOLOv26n, an...
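Two of the agreement measures named in the abstract, mask IoU and Cohen's $\kappa$, can be illustrated on a toy example. The following is only a minimal sketch, not the authors' evaluation code: it assumes both measures are computed per pixel on flattened binary masks (the abstract does not say whether $\kappa$ is taken per pixel or per instance), and the function names `mask_iou` and `cohens_kappa` are hypothetical.

```python
from typing import Sequence


def mask_iou(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Intersection over union of two flattened binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, r in zip(pred, ref) if p and r)
    union = sum(1 for p, r in zip(pred, ref) if p or r)
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def cohens_kappa(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Cohen's kappa for two binary labelings of the same pixels."""
    n = len(pred)
    if n == 0 or n != len(ref):
        raise ValueError("labelings must be non-empty and equally long")
    # Observed agreement: fraction of pixels with the same label.
    p_o = sum(1 for p, r in zip(pred, ref) if bool(p) == bool(r)) / n
    # Chance agreement from each annotator's foreground rate.
    pred_pos = sum(1 for p in pred if p) / n
    ref_pos = sum(1 for r in ref if r) / n
    p_e = pred_pos * ref_pos + (1 - pred_pos) * (1 - ref_pos)
    # Assumption: identical constant labelings count as full agreement.
    return 1.0 if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)


# Toy 4-pixel masks: auto-annotator output vs. human label.
auto_mask = [1, 1, 0, 0]
human_mask = [1, 0, 0, 0]
print(mask_iou(auto_mask, human_mask))      # 0.5
print(cohens_kappa(auto_mask, human_mask))  # 0.5
```

In the toy case the masks agree on three of four pixels (observed agreement 0.75), but half of that agreement is expected by chance given the foreground rates, which is why $\kappa$ comes out at 0.5 rather than 0.75.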

Originally published on April 03, 2026. Curated by AI News.