
[2602.13921] GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization

arXiv - AI February 17, 2026 4 min read Article

Summary

The article presents GREPO, a benchmark for evaluating Graph Neural Networks (GNNs) in repository-level bug localization, addressing limitations of current methods and showcasing GNNs' potential.

Why It Matters

GNNs have seen little use in bug localization, a critical area of software engineering, in part because no dedicated benchmark existed. GREPO fills that gap: it gives researchers a common basis for evaluating and improving GNN methods, which could in turn improve software maintenance and development.

Key Takeaways

GREPO is the first GNN benchmark for repository-level bug localization.
It includes 86 Python repositories and 47,294 bug-fixing tasks.
GNNs outperform traditional information retrieval methods for this task.
The benchmark facilitates future research in GNN applications.
Access to the code and data structures is provided for further exploration.

Computer Science > Machine Learning
arXiv:2602.13921 (cs)
[Submitted on 14 Feb 2026]

Title: GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization

Authors: Juntong Wang, Libin Chen, Xiyuan Wang, Shijia Kang, Haotong Yang, Da Zheng, Muhan Zhang

Abstract: Repository-level bug localization, the task of identifying where code must be modified to fix a bug, is a critical software engineering challenge. Standard Large Language Models (LLMs) are often unsuitable for this task due to context window limitations that prevent them from processing entire code repositories. As a result, various retrieval methods are commonly used, including keyword matching, text similarity, and simple graph-based heuristics such as Breadth-First Search. Graph Neural Networks (GNNs) offer a promising alternative due to their ability to model complex, repository-wide dependencies; however, their application has been hindered by the lack of a dedicated benchmark. To address this gap, we introduce GREPO, the first GNN benchmark for repository-scale bug localization tasks. GREPO comprises 86 Python repositories and 47,294 bug-fixing tasks, providing graph-based data structures ready for direct GNN processing. Our evaluation of various GNN architectures shows outstanding performance compared to established info...
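To make the "simple graph-based heuristics such as Breadth-First Search" baseline concrete, here is a minimal Python sketch of BFS-based candidate retrieval over a repository graph. It is an illustration only, not GREPO's data format or the paper's method: the function name `bfs_candidates`, the `file::function` node naming, the directed call edges, and the toy graph are all hypothetical assumptions.

```python
from collections import deque


def bfs_candidates(graph, seeds, max_hops):
    """Return every node within max_hops of any seed, mapped to its hop distance.

    graph: dict mapping a node to the list of nodes it points to
           (assumed here to be directed call/import edges).
    seeds: starting nodes, e.g. code entities mentioned in a bug report.
    """
    # Multi-source BFS: every seed present in the graph starts at distance 0.
    dist = {s: 0 for s in seeds if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop budget
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


# Hypothetical toy repository graph; not taken from the GREPO dataset.
toy_graph = {
    "cli.py::main": ["parser.py::parse_args", "runner.py::run"],
    "parser.py::parse_args": ["utils.py::normalize"],
    "runner.py::run": ["utils.py::normalize", "io.py::load"],
    "utils.py::normalize": [],
    "io.py::load": [],
}

candidates = bfs_candidates(toy_graph, ["cli.py::main"], max_hops=1)
# Rank closer nodes first; break ties alphabetically for determinism.
ranked = sorted(candidates, key=lambda n: (candidates[n], n))
```

A heuristic like this ranks candidates by graph distance alone and ignores what the code actually contains. A GNN can instead combine node content with repository-wide structure, which is the kind of modeling the abstract credits for the GNNs' promise.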
