[2602.18504] A Computer Vision Framework for Multi-Class Detection and Tracking in Soccer Broadcast Footage
Summary
This paper presents a computer vision framework for detecting and tracking players and the ball in soccer broadcast footage using a single-camera setup, enabling affordable analytics for lower-budget teams.
Why It Matters
The research addresses a significant gap in sports analytics by proposing a cost-effective solution for soccer teams lacking advanced tracking technologies. By utilizing standard broadcast footage, the framework democratizes access to performance data, potentially transforming how amateur and collegiate teams analyze games.
Key Takeaways
- The framework combines a YOLO object detector with the ByteTrack tracking algorithm to identify and track players, referees, goalkeepers, and the ball.
- Strong precision, recall, and mAP50 scores indicate that players and officials are detected and tracked reliably.
- Ball detection remains challenging, highlighting areas for future improvement.
- The approach reduces reliance on expensive hardware, making analytics accessible to more teams.
- AI can extract valuable spatial information from standard broadcast footage.
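To illustrate the detector-plus-tracker idea from the takeaways, here is a minimal sketch of ByteTrack-style two-stage association in plain Python. It is not the paper's implementation: the `Detection` and `Track` types, the thresholds, and the `update` function are hypothetical, and it swaps in greedy IoU matching where ByteTrack uses Kalman-filter motion prediction and Hungarian assignment. The point it shows is the core ByteTrack idea of matching high-confidence detections first, then using low-confidence detections to keep existing tracks alive (for example, a partially occluded player).

```python
from dataclasses import dataclass
from itertools import count

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:  # hypothetical container for one detector output
    box: Box
    score: float
    label: str  # e.g. "player", "referee", "goalkeeper", "ball"


@dataclass
class Track:  # hypothetical container for one tracked object
    track_id: int
    box: Box
    label: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, dets, min_iou):
    """Pair same-class tracks and detections in descending IoU order.

    Simplification: ByteTrack solves this as an assignment problem.
    Returns (pairs, unmatched_tracks, unmatched_detections).
    """
    candidates = sorted(
        ((iou(t.box, d.box), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(dets)
         if t.label == d.label),
        reverse=True,
    )
    used_t, used_d, pairs = set(), set(), []
    for overlap, ti, di in candidates:
        if overlap < min_iou:
            break  # remaining candidates overlap even less
        if ti in used_t or di in used_d:
            continue
        used_t.add(ti)
        used_d.add(di)
        pairs.append((tracks[ti], dets[di]))
    rest_t = [t for i, t in enumerate(tracks) if i not in used_t]
    rest_d = [d for i, d in enumerate(dets) if i not in used_d]
    return pairs, rest_t, rest_d


def update(tracks, dets, ids, high=0.5, low=0.1, min_iou=0.3):
    """Advance the tracker by one frame (threshold values are assumptions)."""
    strong = [d for d in dets if d.score >= high]
    weak = [d for d in dets if low <= d.score < high]
    # Stage 1: associate existing tracks with high-confidence detections.
    first, leftover_tracks, new_dets = greedy_match(tracks, strong, min_iou)
    # Stage 2: try to rescue leftover tracks with low-confidence detections.
    second, _lost, _unused = greedy_match(leftover_tracks, weak, min_iou)
    for track, det in first + second:
        track.box = det.box
    # Only unmatched high-confidence detections start new tracks.
    born = [Track(next(ids), d.box, d.label) for d in new_dets]
    # Simplification: lost tracks are dropped at once; ByteTrack buffers them.
    return [t for t, _ in first + second] + born


ids = count(1)
tracks = update([], [Detection((100, 100, 140, 200), 0.9, "player")], ids)
# Next frame: the same player is detected with low confidence only.
tracks = update(tracks, [Detection((104, 101, 144, 201), 0.3, "player")], ids)
```

In the usage lines, the second-frame detection falls below the `high` threshold, so a tracker that discarded low-confidence boxes would lose the player; here the second stage matches it to the existing track and the identity is kept.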
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.18504 (cs)
[Submitted on 17 Feb 2026]
Authors: Daniel Tshiani
Abstract: Clubs with access to expensive multi-camera setups or GPS tracking systems gain a competitive advantage through detailed data, whereas lower-budget teams are often unable to collect similar information. This paper examines whether such data can instead be extracted directly from standard broadcast footage using a single-camera computer vision pipeline. This project develops an end-to-end system that combines a YOLO object detector with the ByteTrack tracking algorithm to identify and track players, referees, goalkeepers, and the ball throughout a match. Experimental results show that the pipeline achieves high performance in detecting and tracking players and officials, with strong precision, recall, and mAP50 scores, while ball detection remains the primary challenge. Despite this limitation, our findings demonstrate that AI can extract meaningful player-level spatial information from a single broadcast camera. By reducing reliance on specialized hardware, the proposed approach enables colleges, academies, and amateur clubs to adopt scalable, data-driven analysis methods previously ac...
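The abstract reports precision, recall, and mAP50 without reproducing the numbers here. As a reminder of what those metrics measure, the following is a minimal sketch of precision and recall for a single class on a single frame at an IoU threshold of 0.5, the threshold that the "50" in mAP50 refers to. The function names and the sample boxes are made up for illustration, the matching rule is a simplification of standard evaluation tools, and the sketch does not compute mAP itself, which averages precision over recall levels and classes.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """preds: list of (box, score); truths: list of ground-truth boxes.

    Predictions are visited in descending score order, and each may claim
    the best still-unmatched ground-truth box with IoU >= iou_thr.
    """
    matched = set()
    true_pos = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = None, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None:
            matched.add(best_j)
            true_pos += 1
    false_pos = len(preds) - true_pos   # predictions with no ground truth
    false_neg = len(truths) - true_pos  # ground truth the detector missed
    precision = true_pos / (true_pos + false_pos) if preds else 0.0
    recall = true_pos / (true_pos + false_neg) if truths else 0.0
    return precision, recall


# Made-up example: one correct detection, one spurious box, one missed object.
preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
truths = [(0, 0, 10, 10), (100, 100, 110, 110)]
precision, recall = precision_recall(preds, truths)
```

A small, fast-moving object such as the ball tends to yield both missed detections (lower recall) and boxes that overlap the ground truth too little to count at this threshold, which is consistent with ball detection being the reported weak point.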