[2507.17316] Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability
Summary
This paper studies high-probability estimation of a discrete distribution on a domain of size $K$ in Kullback-Leibler divergence, providing upper and lower bounds that pin down the minimax rate up to a doubly logarithmic factor in $K$, and explaining why tight lower bounds are harder to obtain here than for related distances.
Why It Matters
Understanding minimax estimation in Kullback-Leibler divergence is crucial for statisticians and machine learning practitioners, as it impacts how we assess and improve models' predictive accuracy. This research highlights the complexities involved in achieving optimal estimation rates, which can inform future methodologies in statistical learning.
Key Takeaways
- Bounds the minimax rate for high-probability discrete distribution estimation in Kullback-Leibler divergence from above and below, pinning it down up to a $\ln\ln K$ factor.
- Shows that the tail behavior of the minimax rate is worse than for squared total variation and squared Hellinger distance, so the usual reduction to those distances cannot yield a fully tight lower bound.
- Constructs a novel estimator via online-to-batch conversion, using techniques from online learning (see the sketch after this list).
- Discusses how the outcome probabilities of the underlying distribution affect the difficulty of minimax estimation.
- Highlights the gap between minimax and asymptotic rates for Kullback-Leibler estimation.
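
As a rough illustration of the online-to-batch idea, the following sketch is not the authors' estimator: it runs a simple add-alpha online predictor (the Krichevsky-Trofimov rule for alpha = 0.5) over the sample and returns the average of its predictive distributions as the batch estimate. The function name and all parameters are ours.

```python
import numpy as np

def online_to_batch_estimate(samples, K, alpha=0.5):
    """Average the predictive distributions of an add-alpha online learner
    (Krichevsky-Trofimov for alpha=0.5) over one pass through the sample.
    The averaged prediction is returned as the batch estimate."""
    counts = np.zeros(K)
    avg = np.zeros(K)
    for t, x in enumerate(samples):
        # Predictive distribution of the online learner before seeing x_t.
        pred = (counts + alpha) / (t + alpha * K)
        avg += pred
        counts[x] += 1
    # Also include the prediction made after the final observation.
    avg += (counts + alpha) / (len(samples) + alpha * K)
    return avg / (len(samples) + 1)

# Example: estimate a distribution on K=5 outcomes from n=1000 draws.
rng = np.random.default_rng(0)
p = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
samples = rng.choice(5, size=1000, p=p)
p_hat = online_to_batch_estimate(samples, K=5)
kl = np.sum(p * np.log(p / p_hat))
print(p_hat.round(3), f"KL(p || p_hat) = {kl:.4f}")
```

Because every predictive distribution in the average puts positive mass on all K outcomes, the resulting estimate never assigns zero probability, which keeps the Kullback-Leibler divergence finite.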
Statistics > Machine Learning
arXiv:2507.17316 (stat)
[Submitted on 23 Jul 2025 (v1), last revised 20 Feb 2026 (this version, v3)]
Title: Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability
Authors: Dirk van der Hoeven, Julia Olkhovskaia, Tim van Erven
Abstract: We consider the fundamental problem of estimating a discrete distribution on a domain of size $K$ with high probability in Kullback-Leibler divergence. We provide upper and lower bounds on the minimax estimation rate, which show that the optimal rate is between $\big(K + \ln(K)\ln(1/\delta)\big)/n$ and $\big(K\ln\ln(K) + \ln(K)\ln(1/\delta)\big)/n$ at error probability $\delta$ and sample size $n$, which pins down the rate up to the doubly logarithmic factor $\ln \ln K$ that multiplies $K$. Our upper bound uses techniques from online learning to construct a novel estimator via online-to-batch conversion. Perhaps surprisingly, the tail behavior of the minimax rate is worse than for the squared total variation and squared Hellinger distance, for which it is $\big(K + \ln(1/\delta)\big)/n$, i.e. without the $\ln K$ multiplying $\ln(1/\delta)$. As a consequence, we cannot obtain a fully tight lower bound from the usual reduction to these smaller distances. Moreover, we show that this lowe...
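
As a quick numerical illustration (the function and example values below are ours, not from the paper), this sketch evaluates the two rate expressions quoted in the abstract, so the remaining $\ln\ln K$ gap between them can be seen directly.

```python
import math

def kl_rate_bounds(K, n, delta):
    """Evaluate the lower and upper minimax-rate expressions from the abstract:
    (K + ln(K) ln(1/delta)) / n  and  (K ln ln(K) + ln(K) ln(1/delta)) / n."""
    log_term = math.log(K) * math.log(1.0 / delta)
    lower = (K + log_term) / n
    upper = (K * math.log(math.log(K)) + log_term) / n
    return lower, upper

# Example: domain size K = 1000, sample size n = 10^5, error probability 1%.
lo, hi = kl_rate_bounds(K=1000, n=100_000, delta=0.01)
print(f"lower bound ≈ {lo:.4f}, upper bound ≈ {hi:.4f}")
# The two expressions differ only in the ln ln(K) factor multiplying K.
```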