[2508.14936] Can synthetic data reproduce real-world findings in epidemiology? A replication study using adversarial random forests
Quantitative Biology > Quantitative Methods
arXiv:2508.14936 (q-bio)
[Submitted on 19 Aug 2025 (v1), last revised 23 Mar 2026 (this version, v2)]

Title: Can synthetic data reproduce real-world findings in epidemiology? A replication study using adversarial random forests

Authors: Jan Kapar, Kathrin Günther, Lori Ann Vallis, Klaus Berger, Nadine Binder, Hermann Brenner, Stefanie Castell, Beate Fischer, Volker Harth, Bernd Holleczek, Timm Intemann, Till Ittermann, André Karch, Thomas Keil, Lilian Krist, Berit Lange, Michael F. Leitzmann, Katharina Nimptsch, Nadia Obi, Iris Pigeot, Tobias Pischon, Tamara Schikowski, Börge Schmidt, Carsten Oliver Schmidt, Anja M. Sedlmair, Justine Tanoey, Harm Wienbergen, Andreas Wienke, Claudia Wigmann, Marvin N. Wright

Abstract: Synthetic data holds substantial potential to address practical challenges in epidemiology due to restricted data access and privacy concerns. However, many current methods suffer from limited quality, high computational demands, and complexity for non-experts. Furthermore, common evaluation strategies for synthetic data often fail to directly reflect statistical utility and measure privacy risks sufficiently. Against this background, a critical underexplored question is whether synthetic data can reliably reproduce key findings ...