[2604.03473] Evolutionary Search for Automated Design of Uncertainty Quantification Methods
Computer Science > Computation and Language
arXiv:2604.03473 (cs)
[Submitted on 3 Apr 2026]

Title: Evolutionary Search for Automated Design of Uncertainty Quantification Methods
Authors: Mikhail Seleznyov, Daniil Korbut, Viktor Moskvoretskii, Oleg Somov, Alexander Panchenko, Elena Tutubalina

Abstract: Uncertainty quantification (UQ) methods for large language models are predominantly designed by hand based on domain knowledge and heuristics, limiting their scalability and generality. We apply LLM-powered evolutionary search to automatically discover unsupervised UQ methods represented as Python programs. On the task of atomic claim verification, our evolved methods outperform strong manually-designed baselines, achieving up to 6.7% relative ROC-AUC improvement across 9 datasets while generalizing robustly out-of-distribution. Qualitative analysis reveals that different LLMs employ qualitatively distinct evolutionary strategies: Claude models consistently design high-feature-count linear estimators, while Gpt-oss-120B gravitates toward simpler and more interpretable positional weighting schemes. Surprisingly, only Sonnet 4.5 and Opus 4.5 reliably leverage increased method complexity to improve performance -- Opus 4.6 shows an unexpected regression relative to its predecessor. Overall, our results in...
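
To make the abstract's terms concrete, here is a minimal sketch of what an unsupervised UQ method "represented as a Python program" with positional weighting could look like, with a ROC-AUC evaluation and the relative-improvement arithmetic. It is not taken from the paper. The function names, the geometric `decay` weighting, the use of token log-probabilities as input, and the label convention (1 = claim is incorrect) are all assumptions made for illustration.

```python
from typing import Sequence


def positional_uncertainty(token_logprobs: Sequence[float], decay: float = 0.9) -> float:
    """Hypothetical UQ score: weighted mean negative log-probability.

    Token i receives weight decay**i, so earlier tokens count more when
    decay < 1. This weighting is an illustrative assumption, not the
    scheme discovered in the paper.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    weights = [decay ** i for i in range(len(token_logprobs))]
    weighted_nll = sum(w * -lp for w, lp in zip(weights, token_logprobs))
    return weighted_nll / sum(weights)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC-AUC as the probability that a positive outranks a negative.

    Assumed convention: label 1 marks an incorrect claim, which should
    receive a higher uncertainty score. Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline
```

As a reading aid for the "6.7% relative" figure: under this definition, a baseline ROC-AUC of 0.75 would have to rise to roughly 0.80 to count as a 6.7% relative improvement. These numbers are illustrative arithmetic only and are not results reported in the paper.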