[2506.05634] AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization
Computer Science > Machine Learning
arXiv:2506.05634 (cs)
[Submitted on 5 Jun 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization
Authors: Saeed Hedayatian, Stefanos Nikolaidis

Abstract: Quality-Diversity (QD) algorithms have shown remarkable success in discovering diverse, high-performing solutions, but rely heavily on hand-crafted behavioral descriptors that constrain exploration to predefined notions of diversity. Leveraging the equivalence between policies and occupancy measures, we present a theoretically grounded approach to automatically generate behavioral descriptors by embedding the occupancy measures of policies in Markov Decision Processes. Our method, AutoQD, leverages random Fourier features to approximate the Maximum Mean Discrepancy (MMD) between policy occupancy measures, creating embeddings whose distances reflect meaningful behavioral differences. A low-dimensional projection of these embeddings that captures the most behaviorally significant dimensions can then be used as behavioral descriptors for CMA-MAE, a state-of-the-art black-box QD method, to discover diverse policies. We prove that our embeddings converge to true MMD distances between occupancy measures as the number of sam...
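
To make the random-Fourier-feature idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth, treats each policy's occupancy measure as an unweighted set of sampled (state, action) vectors (discounting is ignored), and uses hypothetical helper names (`make_rff`, `embed`, `mmd_estimate`) and synthetic data in place of real rollouts. The Euclidean distance between two mean feature vectors then serves as an approximation of the MMD between the two sample sets.

```python
import math
import random


def make_rff(dim, num_features, bandwidth, rng):
    """Draw frequencies and phases for random Fourier features.

    Assumes the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)),
    whose spectral distribution is N(0, I / bandwidth^2).
    """
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return freqs, phases


def embed(samples, freqs, phases):
    """Average the feature map sqrt(2/D) * cos(w.x + b) over a sample set.

    The result approximates the kernel mean embedding of the distribution
    that generated the samples.
    """
    scale = math.sqrt(2.0 / len(freqs))
    total = [0.0] * len(freqs)
    for x in samples:
        for j, (w, b) in enumerate(zip(freqs, phases)):
            dot = sum(wi * xi for wi, xi in zip(w, x))
            total[j] += scale * math.cos(dot + b)
    return [t / len(samples) for t in total]


def mmd_estimate(emb_a, emb_b):
    """Euclidean distance between two embeddings, approximating the MMD."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))


rng = random.Random(0)
freqs, phases = make_rff(dim=2, num_features=200, bandwidth=1.0, rng=rng)


def synthetic_visits(center, n):
    """Stand-in for (state, action) samples collected from policy rollouts."""
    return [[rng.gauss(c, 0.5) for c in center] for _ in range(n)]


# Two sample sets from the same "behavior" and one from a different behavior.
emb_a = embed(synthetic_visits([0.0, 0.0], 300), freqs, phases)
emb_a2 = embed(synthetic_visits([0.0, 0.0], 300), freqs, phases)
emb_b = embed(synthetic_visits([2.0, 2.0], 300), freqs, phases)

mmd_same = mmd_estimate(emb_a, emb_a2)
mmd_diff = mmd_estimate(emb_a, emb_b)
print("same behavior:", mmd_same)
print("different behavior:", mmd_diff)
```

In this sketch the distance between the two sample sets drawn from the same distribution should come out much smaller than the distance to the shifted one, which is the property that lets such embeddings act as behavioral descriptors. The low-dimensional projection step mentioned in the abstract is not shown here.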