[2510.13205] CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection
Summary
CleverCatch introduces a knowledge-guided weak supervision model for detecting healthcare fraud, enhancing accuracy and interpretability through expert rule integration.
Why It Matters
Healthcare fraud detection is critical yet challenging due to limited labeled data and evolving fraud tactics. CleverCatch addresses these issues by combining domain expertise with machine learning, improving detection accuracy and transparency, which is essential for high-stakes environments like healthcare.
Key Takeaways
- CleverCatch improves fraud detection accuracy by integrating expert rules into machine learning models.
- The model reports a 1.3% improvement in AUC and a 3.4% improvement in recall over existing methods.
- Combining synthetic data with expert knowledge enhances the model's interpretability and adaptability.
- The approach is particularly relevant for high-stakes domains like healthcare, where transparency is crucial.
- CleverCatch bridges the gap between traditional heuristics and modern machine learning techniques.
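For readers unfamiliar with the two metrics named in the takeaways, the following is a minimal, dependency-free sketch of how AUC and recall can be computed for a binary fraud classifier. The labels and scores are made-up toy values, not results from the paper, and the function names `recall` and `auc` are chosen here for illustration only.

```python
def recall(labels, preds):
    """Fraction of true fraud cases (label 1) that the model flagged."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(labels, scores):
    """Probability that a random fraud case outscores a random
    legitimate case, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration): 1 = fraud, 0 = legitimate.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
# Flag a case as fraud when its score reaches an assumed 0.5 threshold.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"recall = {recall(labels, preds):.3f}")
print(f"AUC    = {auc(labels, scores):.3f}")
```

On this toy data, recall is 2/3 (two of three fraud cases are flagged at the 0.5 threshold) and AUC is 7/9 (seven of the nine fraud/legitimate pairs are ranked correctly). Recall depends on the chosen threshold, while AUC depends only on how the scores rank the cases.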
Computer Science > Machine Learning
arXiv:2510.13205 (cs)
[Submitted on 15 Oct 2025 (v1), last revised 22 Feb 2026 (this version, v2)]
Title: CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection
Authors: Amirhossein Mozafari, Kourosh Hashemi, Erfan Shafagh, Soroush Motamedi, Azar Taheri Tayebi, Mohammad A. Tayebi
Abstract: Healthcare fraud detection remains a critical challenge due to limited availability of labeled data, constantly evolving fraud tactics, and the high dimensionality of medical records. Traditional supervised methods are challenged by extreme label scarcity, while purely unsupervised approaches often fail to capture clinically meaningful anomalies. In this work, we introduce CleverCatch, a knowledge-guided weak supervision model designed to detect fraudulent prescription behaviors with improved accuracy and interpretability. Our approach integrates structured domain expertise into a neural architecture that aligns rules and data samples within a shared embedding space. By training encoders jointly on synthetic data representing both compliance and violation, CleverCatch learns soft rule embeddings that generalize to complex, real-world datasets. This hybrid design enables data-driven learning to be enhanced by domain-informed constraints, bridging the gap betwe...
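The idea of scoring a data sample against a learned rule embedding in a shared space can be illustrated with a toy sketch. This is not the paper's architecture. It assumes a linear sample encoder, one embedding vector per rule, a dot-product score, and two hypothetical rules invented here to generate synthetic compliance and violation labels. All names (`RULES`, `encode`, `violation_prob`, and so on) are illustrative.

```python
import math
import random

random.seed(0)
N_FEATURES, DIM = 3, 4

# Hypothetical expert rules (assumptions, not taken from the paper).
# Each returns True when a synthetic record violates the rule.
RULES = [
    lambda x: x[0] > 0.5,   # e.g. "feature 0 exceeds a threshold"
    lambda x: x[1] > x[2],  # e.g. "feature 1 exceeds feature 2"
]


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W = rand_matrix(DIM, N_FEATURES)   # sample encoder: features -> shared space
R = rand_matrix(len(RULES), DIM)   # one soft embedding per rule
B = [0.0] * len(RULES)             # per-rule bias


def encode(x):
    """Project a record into the shared embedding space."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def violation_prob(x, k):
    """Sigmoid of the dot product between rule k and the encoded record."""
    z = sum(r * h for r, h in zip(R[k], encode(x))) + B[k]
    z = max(-30.0, min(30.0, z))   # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n):
    """Synthetic (record, rule index, violated?) triples."""
    data = []
    for _ in range(n):
        x = [random.random() for _ in range(N_FEATURES)]
        for k, rule in enumerate(RULES):
            data.append((x, k, 1.0 if rule(x) else 0.0))
    return data


def train(data, epochs=30, lr=0.05):
    """Joint SGD on the encoder and rule embeddings with logistic loss."""
    for _ in range(epochs):
        random.shuffle(data)
        for x, k, y in data:
            h = encode(x)
            err = violation_prob(x, k) - y   # d(loss)/d(logit)
            rk = R[k][:]                     # keep pre-update rule embedding
            for j in range(DIM):
                R[k][j] -= lr * err * h[j]
                for i in range(N_FEATURES):
                    W[j][i] -= lr * err * rk[j] * x[i]
            B[k] -= lr * err


def accuracy(data):
    hits = sum((violation_prob(x, k) >= 0.5) == (y == 1.0) for x, k, y in data)
    return hits / len(data)


train_data, test_data = make_data(1000), make_data(300)
acc_before = accuracy(test_data)
train(train_data)
acc_after = accuracy(test_data)
print(f"held-out accuracy: {acc_before:.2f} -> {acc_after:.2f}")
```

Because both the encoder `W` and the rule embeddings `R` are updated from the same loss, the rules and the records end up in one space where a dot product measures how strongly a record violates a rule. A real system would presumably use nonlinear encoders and richer rule representations. The linear form is used here only to keep the sketch short.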