[2509.22267] Towards a more realistic evaluation of machine learning models for bearing fault diagnosis
Computer Science > Machine Learning
arXiv:2509.22267 (cs)
[Submitted on 26 Sep 2025 (v1), last revised 3 Mar 2026 (this version, v3)]

Title: Towards a more realistic evaluation of machine learning models for bearing fault diagnosis
Authors: João Paulo Vieira, Victor Afonso Bauler, Rodrigo Kobashikawa Rosa, Danilo Silva

Abstract: Reliable detection of bearing faults is essential for maintaining the safety and operational efficiency of rotating machinery. While recent advances in machine learning (ML), particularly deep learning, have shown strong performance in controlled settings, many studies fail to generalize to real-world applications due to methodological flaws, most notably data leakage. This paper investigates the issue of data leakage in vibration-based bearing fault diagnosis and its impact on model evaluation. We demonstrate that common dataset partitioning strategies, such as segment-wise and condition-wise splits, introduce spurious correlations that inflate performance metrics. To address this, we propose a rigorous, leakage-free evaluation methodology centered on bearing-wise data partitioning, ensuring no overlap between the physical components used for training and testing. Additionally, we reformulate the classification task as a multi-label problem, enabling the dete...
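
The contrast between segment-wise and bearing-wise partitioning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy bearing IDs, and the hold-out fraction are all assumptions made for illustration. The idea is to assign whole physical bearings to either the training or the test set, and then check that no bearing contributes segments to both.

```python
import random
from collections import defaultdict


def bearing_wise_split(bearing_ids, test_fraction=0.3, seed=0):
    """Split segment indices so that no bearing appears in both sets.

    bearing_ids[i] is the (hypothetical) ID of the physical bearing that
    segment i was recorded from.
    """
    # Group segment indices by the bearing they came from.
    by_bearing = defaultdict(list)
    for idx, bearing in enumerate(bearing_ids):
        by_bearing[bearing].append(idx)

    bearings = sorted(by_bearing)
    if len(bearings) < 2:
        raise ValueError("need at least two bearings to split bearing-wise")

    # Shuffle whole bearings, not individual segments.
    random.Random(seed).shuffle(bearings)

    # Hold out at least one bearing and keep at least one for training.
    n_test = max(1, round(test_fraction * len(bearings)))
    n_test = min(n_test, len(bearings) - 1)

    test_idx = [i for b in bearings[:n_test] for i in by_bearing[b]]
    train_idx = [i for b in bearings[n_test:] for i in by_bearing[b]]
    return sorted(train_idx), sorted(test_idx)


def shared_bearings(bearing_ids, train_idx, test_idx):
    """Return the bearings that contribute segments to both sets."""
    in_train = {bearing_ids[i] for i in train_idx}
    in_test = {bearing_ids[i] for i in test_idx}
    return in_train & in_test


# Toy data: 4 bearings with 5 segments each (IDs are made up).
bearing_ids = [b for b in ["B1", "B2", "B3", "B4"] for _ in range(5)]

# Bearing-wise split: one whole bearing is held out for testing.
train_idx, test_idx = bearing_wise_split(bearing_ids, test_fraction=0.25)
leak_free = shared_bearings(bearing_ids, train_idx, test_idx)

# Segment-wise split for contrast: hold out every fourth segment,
# ignoring which bearing it came from.
all_idx = list(range(len(bearing_ids)))
seg_test = all_idx[::4]
seg_train = [i for i in all_idx if i not in seg_test]
leaky = shared_bearings(bearing_ids, seg_train, seg_test)

print("bearing-wise overlap:", leak_free)
print("segment-wise overlap:", sorted(leaky))
```

In this toy setup the bearing-wise split shares no bearings between the two sets, while the segment-wise split places segments from every bearing on both sides. That overlap is the kind that can let a model recognize a specific bearing rather than a fault. Where scikit-learn is available, its group-aware splitters (for example `GroupShuffleSplit`) serve the same purpose.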