[2602.13921] GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization

Summary

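The paper introduces GREPO, a benchmark for evaluating Graph Neural Networks (GNNs) on repository-level bug localization. It targets a gap left by current approaches: LLMs cannot fit an entire repository into their context window, and the retrieval methods used instead rely on keyword matching, text similarity, or simple graph heuristics.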

Why It Matters

Bug localization is a critical software engineering task, yet GNNs have lacked a dedicated benchmark for it. GREPO gives researchers a common basis for evaluating and comparing GNN methods, which could in turn improve software maintenance and development.

Key Takeaways

  • GREPO is the first GNN benchmark for repository-scale bug localization.
  • It includes 86 Python repositories and 47,294 bug-fixing tasks.
  • In the authors' evaluation, GNN architectures outperform traditional information retrieval methods on this task.
  • The benchmark facilitates future research in GNN applications.
  • Access to the code and data structures is provided for further exploration.

Computer Science > Machine Learning
arXiv:2602.13921 (cs) [Submitted on 14 Feb 2026]

Title: GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization

Authors: Juntong Wang, Libin Chen, Xiyuan Wang, Shijia Kang, Haotong Yang, Da Zheng, Muhan Zhang

Abstract: Repository-level bug localization, the task of identifying where code must be modified to fix a bug, is a critical software engineering challenge. Standard Large Language Models (LLMs) are often unsuitable for this task due to context window limitations that prevent them from processing entire code repositories. As a result, various retrieval methods are commonly used, including keyword matching, text similarity, and simple graph-based heuristics such as Breadth-First Search. Graph Neural Networks (GNNs) offer a promising alternative due to their ability to model complex, repository-wide dependencies; however, their application has been hindered by the lack of a dedicated benchmark. To address this gap, we introduce GREPO, the first GNN benchmark for repository-scale bug localization tasks. GREPO comprises 86 Python repositories and 47,294 bug-fixing tasks, providing graph-based data structures ready for direct GNN processing. Our evaluation of various GNN architectures shows outstanding performance compared to established info...
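To make the "simple graph-based heuristics such as Breadth-First Search" mentioned in the abstract more concrete, here is a minimal sketch of what such a baseline might look like. It is not GREPO's code or data format: the adjacency-dict graph, the node names, the `bfs_candidates` function, and the `max_hops` cutoff are all hypothetical and chosen only for illustration. The idea is to start from code entities already suspected to be relevant (for example, ones found by keyword matching) and expand outward along dependency edges, treating closer nodes as more likely edit locations.

```python
from collections import deque


def bfs_candidates(graph, seeds, max_hops):
    """Return every node within max_hops of a seed, mapped to its hop distance.

    graph: dict mapping a node name to a list of neighbor names
           (a hypothetical stand-in for a repository dependency graph).
    seeds: nodes assumed to be relevant to the bug report.
    """
    dist = {seed: 0 for seed in seeds}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Toy call graph with made-up function names (assumption, not from GREPO).
graph = {
    "cli.main": ["parser.parse", "io.read_file"],
    "parser.parse": ["parser.tokenize"],
    "parser.tokenize": ["utils.strip"],
    "io.read_file": [],
    "utils.strip": [],
}

candidates = bfs_candidates(graph, ["cli.main"], max_hops=2)

# Rank candidates by hop distance, breaking ties alphabetically.
ranked = sorted(candidates, key=lambda name: (candidates[name], name))
print(ranked)
```

A heuristic like this scores nodes purely by distance, ignoring what the code at each node contains. A GNN can instead learn how to combine node features with the surrounding graph structure, which is the alternative the abstract argues for.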
