[2603.21530] LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search
Computer Science > Software Engineering
arXiv:2603.21530 (cs)
[Submitted on 23 Mar 2026]
Title: LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search
Authors: Yujia Chen, Yingli Zhou, Fangyuan Zhang, Cuiyun Gao
Abstract: Database Management Systems (DBMSs) are fundamental infrastructure for modern data-driven applications, where thorough testing with high-quality SQL test cases is essential for ensuring system reliability. Traditional approaches such as fuzzing can be effective for specific DBMSs, but adapting them to different proprietary dialects requires substantial manual effort. Large Language Models (LLMs) present promising opportunities for automated SQL test generation, but face critical challenges in industrial environments. First, lightweight models are widely used in organizations due to security and privacy constraints, but they struggle to generate syntactically valid queries for proprietary SQL dialects. Second, LLM-generated queries are often semantically similar and exercise only shallow execution paths, thereby quickly reaching a coverage plateau. To address these challenges, we propose MIST, an LLM-based test case generatIon framework for DBMS through Monte Carlo Tree search. MIST consists of two stages: Feature-Guided Error-Driven Test Case Synthetization, which constructs a hi...