[2509.09192] ReDef: Do Code Language Models Truly Understand Code Changes for Just-in-Time Software Defect Prediction?
Computer Science > Software Engineering
arXiv:2509.09192 (cs)
[Submitted on 11 Sep 2025 (v1), last revised 2 Apr 2026 (this version, v2)]

Title: ReDef: Do Code Language Models Truly Understand Code Changes for Just-in-Time Software Defect Prediction?
Authors: Doha Nam, Taehyoun Kim, Duksan Ryu, Jongmoon Baik

Abstract: Just-in-Time software defect prediction (JIT-SDP) plays a critical role in prioritizing risky code changes during code review and continuous integration. However, existing datasets often suffer from noisy labels and low precision in identifying bug-inducing commits. To address this, we present ReDef (Revert-based Defect dataset), a high-confidence benchmark of function-level modifications curated from 22 large-scale C/C++ projects. Defective cases are anchored by revert commits, while clean cases are validated through post-hoc history checks. Ambiguous instances are conservatively filtered out via a GPT-assisted triage process involving multiple votes and audits. This pipeline yields 3,164 defective and 10,268 clean modifications, offering substantially more reliable labels than prior resources. Beyond dataset construction, we provide a systematic evaluation of how Code Language Models (CLMs), specifically CodeBERT, CodeT5+, UniXcoder, and Qwen2.5, reason about code mo...
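The conservative, vote-based triage the abstract describes can be sketched as a simple majority-vote filter. The function name, vote labels, and agreement threshold below are illustrative assumptions, not the authors' actual pipeline, which additionally involves audits of the GPT votes.

```python
from collections import Counter

def triage(votes, min_agreement=3):
    """Conservatively triage one instance given multiple model votes.

    votes: list of 'defective' / 'clean' / 'unsure' strings
    (hypothetical labels for illustration).
    Returns a label only when enough votes agree on a non-'unsure'
    answer; ambiguous instances are filtered out (returned as None).
    """
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    if label != "unsure" and n >= min_agreement:
        return label
    return None  # ambiguous: drop the instance from the dataset

# Unanimous votes keep the instance; split or unsure votes drop it.
print(triage(["defective", "defective", "defective"]))  # defective
print(triage(["defective", "clean", "unsure"]))         # None
```

The point of requiring strong agreement is precision over recall: a borderline commit is discarded rather than risked as a noisy label, matching the dataset's high-confidence goal.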