[2404.12613] Model Selection and Parameter Estimation of One-Dimensional Gaussian Mixture Models
Summary
This paper investigates model selection and parameter estimation for one-dimensional Gaussian mixture models (GMMs), focusing on optimal sampling complexity and a novel Fourier-based estimation approach.
Why It Matters
Gaussian mixture models are widely used in statistics and machine learning. This paper characterizes how many samples are needed to correctly identify the number of components, and pairs that limit with an estimation method whose sample complexity attains it. The results tie the required sample size to the separation between component means and the total number of components, which is useful to researchers and practitioners who must choose a model order from data.
Key Takeaways
- Establishes optimal sampling complexity for model order estimation in GMMs.
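- Proves a lower bound on the number of samples needed to identify the number of components with high probability; the bound depends on the separation between component means and the total number of components.
- Introduces a Fourier-based method, built on Fourier measurements constructed from the samples, whose sample complexity matches that lower bound.
- Reports numerical experiments in which the proposed method outperforms conventional techniques in efficiency and accuracy.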
- Highlights the importance of accurate model selection in statistical learning.
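To make the phrase "Fourier measurements constructed from the samples" concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes the measurements are values of the empirical characteristic function, and that all components share one known standard deviation `sigma`; both are assumptions made for illustration. The function names, mixture parameters, and frequency grid are hypothetical.

```python
import numpy as np

def empirical_cf(samples, t):
    """Empirical characteristic function (1/n) * sum_j exp(i * t * X_j), one value per frequency in t."""
    samples = np.asarray(samples, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # Outer product builds a (len(t), n) grid of phases; average over the samples.
    return np.exp(1j * np.outer(t, samples)).mean(axis=1)

def mixture_cf(t, weights, means, sigma):
    """Exact characteristic function of sum_k w_k * N(mu_k, sigma^2) (shared sigma is an assumption)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    envelope = np.exp(-0.5 * (sigma * t) ** 2)  # Gaussian factor common to all components
    return envelope * (np.exp(1j * np.outer(t, means)) @ weights)

# Hypothetical two-component mixture used only for illustration.
rng = np.random.default_rng(0)
weights = np.array([0.4, 0.6])
means = np.array([-2.0, 1.5])
sigma = 1.0
n = 20000

# Draw i.i.d. samples: pick a component label, then add Gaussian noise around its mean.
labels = rng.choice(len(weights), size=n, p=weights)
samples = means[labels] + sigma * rng.standard_normal(n)

# Compare the sample-based Fourier data with the exact characteristic function.
t = np.linspace(-2.0, 2.0, 9)
ecf = empirical_cf(samples, t)
cf = mixture_cf(t, weights, means, sigma)
max_err = np.max(np.abs(ecf - cf))
print(max_err)
```

Under the shared-variance assumption, dividing the characteristic function by the Gaussian envelope leaves a weighted sum of complex exponentials in the component means. That is one way frequency-domain data can carry information about both the number of components and their locations. The deviation `max_err` shrinks as `n` grows, which is the kind of sample-size dependence a sample complexity analysis quantifies.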
Statistics > Machine Learning
arXiv:2404.12613 (stat)
[Submitted on 19 Apr 2024 (v1), last revised 9 Feb 2026 (this version, v3)]
Title: Model Selection and Parameter Estimation of One-Dimensional Gaussian Mixture Models
Authors: Xinyu Liu, Hai Zhang
Abstract: In this paper, we study the problem of learning one-dimensional Gaussian mixture models (GMMs) with a specific focus on estimating both the model order and the mixing distribution from independent and identically distributed (i.i.d.) samples. This paper establishes the optimal sampling complexity for model order estimation in one-dimensional Gaussian mixture models. We prove a fundamental lower bound on the number of samples required to correctly identify the number of components with high probability, showing that this limit depends critically on the separation between component means and the total number of components. We then propose a Fourier-based approach to estimate both the model order and the mixing distribution. Our algorithm utilizes Fourier measurements constructed from the samples, and our analysis demonstrates that its sample complexity matches the established lower bound, thereby confirming its optimality. Numerical experiments further show that our method outperforms conventional techniques in terms of efficiency and accuracy.
Su...