[2512.14990] Imitation Game: Reproducing Deep Learning Bugs Leveraging an Intelligent Agent

Summary

The paper presents RepGen, an intelligent agent that automates the reproduction of deep learning bugs. Evaluated on 106 real-world bugs, it achieves an 80.19% reproduction rate, which the authors report as a 19.81% improvement over the state of the art.

Why It Matters

Deep learning applications are increasingly prevalent but often suffer from bugs that are hard to reproduce due to their complexity. RepGen addresses this challenge, enhancing the reliability of deep learning systems and potentially reducing development time and cognitive load for engineers.

Key Takeaways

  • RepGen automates the reproduction of deep learning bugs, achieving an 80.19% success rate.
  • In a developer study, RepGen improves bug reproduction success by 23.35% and reduces reproduction time by 56.8%.
  • Manual reproduction of deep learning bugs is highly unreliable: according to recent studies, only about 3% can be reliably reproduced by hand.
  • RepGen utilizes a learning-enhanced context and an iterative generate-validate-refine mechanism.
  • The developer study involved 27 participants and also found that using RepGen lowers cognitive load.
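
The iterative generate-validate-refine mechanism named in the takeaways can be illustrated with a minimal sketch. This is not RepGen's implementation: the function names, the `Attempt` type, the feedback convention (an empty string means "passed"), and the toy generator and validator are all assumptions made for illustration. In a real system the generator would presumably be an LLM call and the validator would run the candidate code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    code: str        # candidate reproduction script that passed validation
    iterations: int  # number of generate/validate rounds it took


def generate_validate_refine(
    generate: Callable[[str, str], str],  # (context, feedback) -> candidate code
    validate: Callable[[str], str],       # candidate code -> feedback ("" if OK)
    context: str,
    max_iters: int = 5,
) -> Optional[Attempt]:
    """Return the first candidate that passes validation, or None."""
    feedback = ""
    for i in range(1, max_iters + 1):
        candidate = generate(context, feedback)  # generate (or refine)
        feedback = validate(candidate)           # validate
        if not feedback:
            return Attempt(candidate, i)
        # Non-empty feedback is fed into the next generate call (refine).
    return None


def toy_generate(context: str, feedback: str) -> str:
    # Hypothetical stand-in for an LLM: adds a seed only after being told to.
    lines = ["import random"]
    if "seed" in feedback:
        lines.append("random.seed(0)")
    lines.append("print(random.random())")
    return "\n".join(lines)


def toy_validate(code: str) -> str:
    # Hypothetical check: a reproduction script must fix its random seed.
    if "random.seed" in code:
        return ""
    return "missing seed: output is nondeterministic"


result = generate_validate_refine(toy_generate, toy_validate, "bug report text")
print(result.iterations if result else "not reproduced")
```

With these toy stand-ins, the first candidate fails validation and the feedback steers the second candidate to pass. Capping the loop with `max_iters` is a design choice of this sketch so that an unreproducible bug terminates with `None` instead of looping forever.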

Computer Science > Software Engineering
arXiv:2512.14990 (cs)
[Submitted on 17 Dec 2025 (v1), last revised 26 Feb 2026 (this version, v3)]

Authors: Mehil B Shah, Mohammad Masudur Rahman, Foutse Khomh

Abstract: Despite their wide adoption in various domains (e.g., healthcare, finance, software engineering), Deep Learning (DL)-based applications suffer from many bugs, failures, and vulnerabilities. Reproducing these bugs is essential for their resolution, but it is extremely challenging due to the inherent nondeterminism of DL models and their tight coupling with hardware and software environments. According to recent studies, only about 3% of DL bugs can be reliably reproduced using manual approaches. To address these challenges, we present RepGen, a novel, automated, and intelligent approach for reproducing deep learning bugs. RepGen constructs a learning-enhanced context from a project, develops a comprehensive plan for bug reproduction, employs an iterative generate-validate-refine mechanism, and thus generates such code using an LLM that reproduces the bug at hand. We evaluate RepGen on 106 real-world deep learning bugs and achieve a reproduction rate of 80.19%, a 19.81% improvement over the state-of-the-art measure...
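
The headline figures in the abstract can be sanity-checked with simple arithmetic. The sketch below recovers the bug count implied by an 80.19% rate on 106 bugs. The baseline calculation rests on an assumption that the abstract does not confirm: that the 19.81% improvement is meant in percentage points rather than as a relative gain.

```python
TOTAL_BUGS = 106  # number of evaluated bugs, from the abstract

# 80.19% of 106 bugs rounds to a whole number of reproduced bugs.
reproduced = round(0.8019 * TOTAL_BUGS)
rate = 100 * reproduced / TOTAL_BUGS
print(reproduced, round(rate, 2))  # 85 80.19

# ASSUMPTION: the 19.81% improvement is in percentage points.
baseline_rate = 80.19 - 19.81
baseline_bugs = round(baseline_rate / 100 * TOTAL_BUGS)
print(baseline_bugs, round(100 * baseline_bugs / TOTAL_BUGS, 2))  # 64 60.38
```

Under that reading, the reported rate corresponds to 85 of 106 bugs, and the implied baseline to 64 of 106. If the improvement is instead relative, the implied baseline would differ.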
