[2602.23341] Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

arXiv - Machine Learning · February 27, 2026 · 4 min read

Summary

This article summarizes a paper on Gaussian mean estimation from coarse data, where each sample is observed only as a set that contains it. The paper addresses when the mean is identifiable under convex partitions and how to estimate it with efficient algorithms.

Why It Matters

Coarse data arise in fields such as economics and machine learning, where values are often observed only partially, for example through measurement rounding or sensor limitations. This research clarifies when the mean is identifiable from such data and how to estimate it efficiently, which can improve how partially observed data are analyzed.

Key Takeaways

The paper characterizes when the mean is identifiable under convex partitions.
It proposes efficient algorithms for mean estimation from coarse data.
It resolves questions left open by earlier work on coarse data [FKKT21].

Computer Science > Machine Learning

arXiv:2602.23341 (cs) [Submitted on 26 Feb 2026]

Title: Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

Authors: Alkis Kalavasis, Anay Mehrotra, Manolis Zampetakis, Felix Zhou, Ziyu Zhu

Abstract: Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, where each true sample $x$ is drawn from a $d$-dimensional Gaussian distribution with identity covariance, but is revealed only through the set of a partition containing $x$. When the coarse samples, roughly speaking, have "low" information, the mean cannot be uniquely recovered from observed samples (i.e., the problem is not identifiable). Recent work by Fotakis, Kalavasis, Kontonis, and Tzamos [FKKT21] established that sample-efficient mean estimation is possible when the unknown mean is identifiable and the partition consists of only convex sets. Moreover, they showed that without convexity, mean estimation becomes NP-hard. However, two fundamental questions remained open: (1) When is the mean identifiable under convex partitions? (2) Is computation...
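To make the coarse-data setting in the abstract concrete, here is a minimal one-dimensional sketch. It is not the paper's algorithm. It assumes a hypothetical partition of the real line into intervals of equal width (a convex partition, similar to measurement rounding), reveals only the interval containing each sample, and recovers the mean by maximizing the likelihood of the observed intervals. The function names, the interval width, and the use of ternary search are all illustrative choices; ternary search is used here because the probability that a Gaussian lands in an interval is log-concave in the mean, so the log-likelihood has a single peak.

```python
import math
import random
from collections import Counter


def coarsen(x, width=1.0):
    """Return the index k of the interval [k*width, (k+1)*width) containing x."""
    return math.floor(x / width)


def cell_probability(k, mu, width=1.0):
    """P(x in cell k) for x ~ N(mu, 1), evaluated in the lower tail for accuracy."""
    lo, hi = k * width - mu, (k + 1) * width - mu
    if lo > 0:
        # Reflect the interval so both endpoints lie in the lower tail,
        # where erfc keeps small probabilities accurate.
        lo, hi = -hi, -lo
    s = math.sqrt(2.0)
    return 0.5 * (math.erfc(-hi / s) - math.erfc(-lo / s))


def log_likelihood(mu, counts, width=1.0):
    """Log-likelihood of mean mu given how often each cell was observed."""
    return sum(
        n * math.log(max(cell_probability(k, mu, width), 1e-300))
        for k, n in counts.items()
    )


def estimate_mean(cells, width=1.0, iters=100):
    """Maximize the coarse-data log-likelihood over mu by ternary search."""
    counts = Counter(cells)
    lo = min(counts) * width          # left edge of the leftmost observed cell
    hi = (max(counts) + 1) * width    # right edge of the rightmost observed cell
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, counts, width) < log_likelihood(m2, counts, width):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Simulated data: the learner never sees the raw samples, only cell indices.
rng = random.Random(0)
true_mean = 0.3
cells = [coarsen(rng.gauss(true_mean, 1.0)) for _ in range(5000)]
estimate = estimate_mean(cells)
print(f"true mean = {true_mean}, estimate from coarse data = {estimate:.3f}")
```

In this toy setting the mean is identifiable because different means produce different interval frequencies. By contrast, if the partition had a single cell covering the whole line, every mean would produce the same observations, which is the "low" information case the abstract describes.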
