[2602.15552] Latent Regularization in Generative Test Input Generation
Summary
This paper examines how latent space regularization affects the quality of generated test inputs for deep learning classifiers, demonstrating improved fault detection rates with style-based GANs.
Why It Matters
As deep learning models are increasingly deployed in critical applications, ensuring their robustness through effective testing is vital. This study highlights innovative techniques for generating diverse and valid test inputs, which can enhance model reliability and safety.
Key Takeaways
- Latent regularization through truncation can significantly improve the quality of generated test inputs.
- Style-based GANs were used to evaluate the effectiveness of different truncation strategies.
- Latent code mixing outperformed random truncation in terms of fault detection, diversity, and validity.
- The study used three datasets for evaluation: MNIST, Fashion MNIST, and CIFAR-10.
- Improving test input generation is crucial for enhancing the robustness of deep learning classifiers.
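The "truncation" named in the takeaways above is commonly realized in style-based GANs as the truncation trick: a sampled latent code is pulled toward the average latent, trading diversity for sample fidelity. The paper's exact procedure is not given here, so the following is a minimal NumPy sketch of the standard formulation; the names `truncate_latent`, `w_avg`, and the latent dimension 512 are illustrative assumptions.

```python
import numpy as np

def truncate_latent(w, w_avg, psi=0.7):
    """Pull a latent code toward the average latent (the "truncation trick").

    psi = 1.0 leaves the code unchanged; psi = 0.0 collapses it to w_avg.
    Smaller psi regularizes the latent: generated inputs become more typical
    (higher validity) at the cost of diversity.
    """
    return w_avg + psi * (w - w_avg)

rng = np.random.default_rng(0)
w_avg = np.zeros(512)        # stand-in for the generator's average latent
w = rng.normal(size=512)     # a sampled latent code
w_trunc = truncate_latent(w, w_avg, psi=0.5)
```

With `psi=0.5` every coordinate of `w_trunc` lies halfway between the sample and the average, which is the regularization effect the paper evaluates.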
Computer Science > Software Engineering
arXiv:2602.15552 (cs) [Submitted on 17 Feb 2026]

Title: Latent Regularization in Generative Test Input Generation
Authors: Giorgi Merabishvili, Oliver Weißl, Andrea Stocco

Abstract: This study investigates the impact of regularization of latent spaces through truncation on the quality of generated test inputs for deep learning classifiers. We evaluate this effect using style-based GANs, a state-of-the-art generative approach, and assess quality along three dimensions: validity, diversity, and fault detection. We evaluate our approach on the boundary testing of deep learning image classifiers across three datasets: MNIST, Fashion MNIST, and CIFAR-10. We compare two truncation strategies: latent code mixing with binary search optimization and random latent truncation for generative exploration. Our experiments show that the latent code-mixing approach yields a higher fault detection rate than random truncation, while also improving both diversity and validity.

Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG)
Cite as: arXiv:2602.15552 [cs.SE] (or arXiv:2602.15552v1 [cs.SE] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.15552 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From: Andrea Stocco
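The abstract's "latent code mixing with binary search optimization" is described only at a high level, but in boundary testing such a search typically bisects the mixing coefficient between two latent codes until the classifier's prediction flips. The sketch below is an assumed reading of that idea, not the paper's implementation; `binary_search_mixing` and the toy `predict` function are hypothetical, and the generator call is folded into `predict` for brevity.

```python
import numpy as np

def binary_search_mixing(w_seed, w_target, predict, depth=20):
    """Bisect the mixing coefficient alpha between two latent codes.

    Linearly mixes w_seed toward w_target and returns the smallest alpha
    (to within 2**-depth) at which the classifier's prediction changes,
    i.e. a latent near the decision boundary. Returns None if the label
    never changes along this direction.
    """
    base = predict(w_seed)
    if predict(w_target) == base:
        return None
    lo, hi = 0.0, 1.0
    for _ in range(depth):
        mid = (lo + hi) / 2
        w_mid = (1 - mid) * w_seed + mid * w_target  # latent code mixing
        if predict(w_mid) == base:
            lo = mid   # still on the seed's side of the boundary
        else:
            hi = mid   # prediction flipped; tighten from above
    return hi

# Toy check: a "classifier" that thresholds the first latent dimension at 0.5.
predict = lambda w: int(w[0] > 0.5)
alpha = binary_search_mixing(np.zeros(4), np.ones(4), predict)
```

In the toy check the decision boundary sits at alpha = 0.5, and the bisection recovers it to within about 1e-6, which illustrates why a guided search can find boundary inputs more reliably than random truncation.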