[2602.23295] ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset Distillation
Summary
The paper presents ManifoldGD, a training-free framework for dataset distillation using hierarchical manifold guidance, improving efficiency and fidelity in data generation.
Why It Matters
As datasets grow larger, efficient model training becomes harder. ManifoldGD addresses this by synthesizing compact datasets that retain the essential knowledge of the full training set while sharply reducing storage and compute. Because it requires no retraining of the diffusion model, it improves the quality of the synthesized data at low cost, making it relevant for large-scale machine learning pipelines.
Key Takeaways
- ManifoldGD offers a training-free method for dataset distillation.
- It integrates manifold-consistent guidance at each denoising step.
- The framework improves representativeness, diversity, and image fidelity.
- Empirical results show gains over existing methods in key performance metrics.
- ManifoldGD is the first geometry-aware training-free distillation framework.
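The summary does not include code, but the core idea of injecting guidance at every denoising step can be sketched in a few lines. The following is an illustrative toy, not the authors' implementation: `score_fn` stands in for a pre-trained diffusion score model, the centroids play the role of the instance prototype coreset, and the simple additive blending rule with weights `eta` and `gamma` is an assumption chosen for clarity.

```python
import numpy as np

def guidance_gradient(x, centroids):
    """Gradient of squared distance to the nearest prototype (pulls x toward it)."""
    d = ((centroids - x) ** 2).sum(axis=1)
    nearest = centroids[np.argmin(d)]
    return nearest - x

def guided_denoise(x_T, centroids, score_fn, steps=50, eta=0.1, gamma=0.05):
    """Toy guided reverse process: at every step, blend the model score
    with a guidance term toward the nearest prototype centroid.
    The blending rule is illustrative, not the paper's exact update."""
    x = x_T.copy()
    for t in range(steps, 0, -1):
        score = score_fn(x, t)                # stand-in for a diffusion score model
        guide = guidance_gradient(x, centroids)
        x = x + eta * (score + gamma * guide)
    return x
```

The key structural point is that the guidance term is applied inside the loop, at every timestep, rather than only at initialization or as a post-hoc filter.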
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.23295 (cs) [Submitted on 26 Feb 2026]
Title: ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset Distillation
Authors: Ayush Roy, Wei-Yang Alex Lee, Rudrasis Chakraborty, Vishnu Suresh Lokhande
Abstract: Large datasets hinder efficient model training while also containing redundant concepts. Dataset distillation aims to synthesize compact datasets that preserve the knowledge of large-scale training sets while drastically reducing storage and computation. Recent advances in diffusion models have enabled training-free distillation by leveraging pre-trained generative priors; however, existing guidance strategies remain limited. Current score-based methods either perform unguided denoising or rely on simple mode-based guidance toward instance prototype centroids (IPC centroids), which are often rudimentary and suboptimal. We propose Manifold-Guided Distillation (ManifoldGD), a training-free diffusion-based framework that integrates manifold-consistent guidance at every denoising timestep. Our method employs IPCs computed via a hierarchical, divisive clustering of VAE latent features, yielding a multi-scale coreset of IPCs that captures both coarse semantic modes and fine i...
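The abstract describes computing prototypes by hierarchical, divisive clustering of latent features, collecting cluster means at multiple scales. A minimal sketch of that idea, assuming a bisecting k-means style split (the function names and the depth parameter are hypothetical, and real latent features would come from a VAE encoder):

```python
import numpy as np

def two_means(X, iters=10, seed=0):
    """Split X into two clusters with a few Lloyd iterations."""
    rng = np.random.default_rng(seed)
    c = X[rng.choice(len(X), 2, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - c[None]) ** 2).sum(-1), axis=1)
        for k in range(2):
            if (labels == k).any():
                c[k] = X[labels == k].mean(axis=0)
    return labels, c

def divisive_coreset(X, depth=2):
    """Hierarchical divisive clustering: recursively bisect the data,
    keeping the cluster means at every level so the resulting prototype
    set spans coarse modes (shallow levels) and finer ones (deep levels)."""
    prototypes = [X.mean(axis=0)]          # coarsest semantic mode
    clusters = [X]
    for _ in range(depth):
        next_clusters = []
        for C in clusters:
            if len(C) < 2:
                next_clusters.append(C)
                continue
            labels, means = two_means(C)
            prototypes.extend(means)
            next_clusters += [C[labels == 0], C[labels == 1]]
        clusters = next_clusters
    return np.stack(prototypes)
```

The multi-scale aspect comes from retaining prototypes at every level of the split, rather than only the leaves: the root mean captures the coarsest mode, while deeper means capture fine-grained variation.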