[2511.19476] FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection

Summary

The paper presents FAST, a DNN-free coreset selection framework that matches the distribution of the selected subset to that of the full dataset in the frequency domain, under graph (topology) constraints, with the goal of making deep learning training cheaper.

Why It Matters

Coreset selection cuts the cost of training deep neural networks by shrinking the dataset while preserving performance. According to the abstract, existing methods are either tied to a specific model (DNN-based) or rely on heuristics without theoretical guarantees (DNN-free). FAST targets both limitations, and its reported energy savings could matter across machine learning applications.

Key Takeaways

  • FAST introduces a DNN-free framework for coreset selection based on spectral graph theory.
  • The method uses the Characteristic Function Distance (CFD) to compare distributions in the frequency domain, where full distributional information is available.
  • It achieves an average accuracy gain of 9.12% over existing methods and reduces power consumption by 96.57%.
  • The Progressive Discrepancy-Aware Sampling strategy improves convergence and efficiency.
  • FAST reports a 2.2x average speedup in training.
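
For readers unfamiliar with the CFD mentioned above, one common textbook form is sketched here. The weighting density \omega and the notation are assumptions for illustration; this summary does not state the exact definition used in FAST.

```latex
% Characteristic function of a distribution P over vectors x
\varphi_P(\mathbf{t}) = \mathbb{E}_{\mathbf{x} \sim P}\!\left[ e^{\, i\, \mathbf{t}^{\top} \mathbf{x}} \right]

% Squared characteristic function distance under a weighting density \omega (assumed)
\mathrm{CFD}_{\omega}^{2}(P, Q)
  = \int \left| \varphi_P(\mathbf{t}) - \varphi_Q(\mathbf{t}) \right|^{2} \omega(\mathbf{t}) \, d\mathbf{t}

% For scalar x, moments (when they exist) are derivatives of \varphi_P at zero
\mathbb{E}_{x \sim P}\!\left[ x^{k} \right] = i^{-k} \, \varphi_P^{(k)}(0)
```

The last identity is why a distance on characteristic functions is sensitive to moments of every order, not only the mean and variance.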

arXiv:2511.19476 (Statistics > Machine Learning)
Submitted on 22 Nov 2025 (v1), last revised 22 Feb 2026 (this version, v2)

Title: FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection

Authors: Boran Zhao, Jin Cui, Jiajun Xu, Jiaqi Guo, Shuo Guan, Pengju Ren

Abstract: Coreset selection compresses large datasets into compact, representative subsets, reducing the energy and computational burden of training deep neural networks. Existing methods are either: (i) DNN-based, which are tied to model-specific parameters and introduce architectural bias; or (ii) DNN-free, which rely on heuristics lacking theoretical guarantees. Neither approach explicitly constrains distributional equivalence, largely because continuous distribution matching is considered inapplicable to discrete sampling. Moreover, prevalent metrics (e.g., MSE, KL, CE, MMD) cannot accurately capture higher-order moment discrepancies, leading to suboptimal coresets. In this work, we propose FAST, the first DNN-free distribution-matching coreset selection framework that formulates the coreset selection task as a graph-constrained optimization problem grounded in spectral graph theory and employs the Characteristic Function Distance (CFD) to capture full distributional information in the frequency domain. We fur...
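
The abstract's point about higher-order moments can be illustrated with a small sketch. It is not the paper's implementation: the helper names `empirical_cf` and `cfd_squared`, the 1-D data, and the uniform frequency grid on [-3, 3] (standing in for a weighting over frequencies) are all assumptions made for illustration. The sketch estimates a squared CFD between two samples by averaging the squared modulus of the difference between their empirical characteristic functions, and compares a Gaussian sample with a uniform sample that shares its mean and variance.

```python
import cmath
import math
import random


def empirical_cf(samples, t):
    """Empirical characteristic function of 1-D samples at frequency t."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)


def cfd_squared(xs, ys, freqs):
    """Average of |phi_xs(t) - phi_ys(t)|^2 over the given frequencies."""
    total = 0.0
    for t in freqs:
        total += abs(empirical_cf(xs, t) - empirical_cf(ys, t)) ** 2
    return total / len(freqs)


rng = random.Random(0)
n = 2000

# Two independent samples from the same standard Gaussian.
gauss_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
gauss_b = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Uniform on [-sqrt(3), sqrt(3)]: mean 0 and variance 1, like the Gaussian,
# so the two distributions differ only in higher-order moments.
half_width = math.sqrt(3.0)
uniform = [rng.uniform(-half_width, half_width) for _ in range(n)]

# Illustrative frequency grid (an assumption, not taken from the paper).
freqs = [-3.0 + 0.1 * k for k in range(61)]

same = cfd_squared(gauss_a, gauss_b, freqs)   # same distribution
diff = cfd_squared(gauss_a, uniform, freqs)   # matched mean/variance, different shape

print(f"CFD^2 (Gaussian vs Gaussian): {same:.5f}")
print(f"CFD^2 (Gaussian vs uniform):  {diff:.5f}")
```

A criterion that compares only means and variances would treat the Gaussian and uniform samples as nearly identical, whereas the frequency-domain distance separates them. A coreset selector built on such a distance would, under these assumptions, score candidate subsets by how small their distance to the full dataset is.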
