[2512.14990] Imitation Game: Reproducing Deep Learning Bugs Leveraging an Intelligent Agent
Summary
The paper presents RepGen, an intelligent agent that automates the reproduction of deep learning bugs. Evaluated on 106 real-world bugs, it achieves an 80.19% reproduction rate, a reported 19.81% improvement over the state of the art.
Why It Matters
Deep learning applications are increasingly prevalent, but their bugs are hard to reproduce because models are inherently nondeterministic and tightly coupled to their hardware and software environments. RepGen addresses this challenge, which could improve the reliability of deep learning systems and reduce development time and cognitive load for engineers.
Key Takeaways
- RepGen automates the reproduction of deep learning bugs, achieving an 80.19% success rate.
- The approach improves bug reproduction success by 23.35% and reduces reproduction time by 56.8%.
- Manual reproduction of deep learning bugs is highly unreliable: according to recent studies, only about 3% can be reliably reproduced by hand.
- RepGen builds a learning-enhanced context from the project, plans the reproduction, and uses an LLM in an iterative generate-validate-refine mechanism to produce code that reproduces the bug.
- A study with 27 developers found that using RepGen reduced their cognitive load.
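The generate-validate-refine mechanism named in the takeaways can be pictured as a small feedback loop. The sketch below is only an illustration of that general pattern, not RepGen's implementation: the function names, the convention that an empty feedback string means success, and the toy generator and validator are all hypothetical. A real system would call an LLM in place of `toy_generate` and would run candidates in an isolated environment rather than with `exec`.

```python
from typing import Callable, Optional


def generate_validate_refine(
    generate: Callable[[str, str], str],
    validate: Callable[[str], str],
    context: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Return the first candidate the validator accepts, or None.

    `generate(context, feedback)` proposes reproduction code.
    `validate(candidate)` returns "" on success, else a feedback message
    that is passed to the next generation round (the "refine" step).
    """
    feedback = ""
    for _ in range(max_rounds):
        candidate = generate(context, feedback)
        feedback = validate(candidate)
        if not feedback:
            return candidate
    return None


def toy_generate(context: str, feedback: str) -> str:
    # Hypothetical stand-in for an LLM call: the first attempt misses the
    # bug, and the refined attempt (made once feedback exists) triggers it.
    return "x = 1 / 1" if not feedback else "x = 1 / 0"


def toy_validate(candidate: str) -> str:
    # Hypothetical validator: the "bug" counts as reproduced only if the
    # candidate raises ZeroDivisionError when executed.
    try:
        exec(candidate, {})
    except ZeroDivisionError:
        return ""
    except Exception as exc:
        return f"raised {type(exc).__name__}, expected ZeroDivisionError"
    return "ran without error, expected ZeroDivisionError"


result = generate_validate_refine(toy_generate, toy_validate, "bug report text")
print(result)
```

Capping the loop with `max_rounds` keeps a generator that never converges from running indefinitely; in that case the function returns `None` so the caller can report the bug as not reproduced.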
Computer Science > Software Engineering
arXiv:2512.14990 (cs)
[Submitted on 17 Dec 2025 (v1), last revised 26 Feb 2026 (this version, v3)]
Title: Imitation Game: Reproducing Deep Learning Bugs Leveraging an Intelligent Agent
Authors: Mehil B Shah, Mohammad Masudur Rahman, Foutse Khomh
Abstract: Despite their wide adoption in various domains (e.g., healthcare, finance, software engineering), Deep Learning (DL)-based applications suffer from many bugs, failures, and vulnerabilities. Reproducing these bugs is essential for their resolution, but it is extremely challenging due to the inherent nondeterminism of DL models and their tight coupling with hardware and software environments. According to recent studies, only about 3% of DL bugs can be reliably reproduced using manual approaches. To address these challenges, we present RepGen, a novel, automated, and intelligent approach for reproducing deep learning bugs. RepGen constructs a learning-enhanced context from a project, develops a comprehensive plan for bug reproduction, employs an iterative generate-validate-refine mechanism, and thus generates such code using an LLM that reproduces the bug at hand. We evaluate RepGen on 106 real-world deep learning bugs and achieve a reproduction rate of 80.19%, a 19.81% improvement over the state-of-the-art measure...