[2510.07746] t-SNE Exaggerates Clusters, Provably
arXiv:2510.07746 [cs.LG]
Submitted on 9 Oct 2025 (v1); last revised 2 Mar 2026 (this version, v2)

Title: t-SNE Exaggerates Clusters, Provably
Authors: Noah Bergam, Szymon Snoeck, Nakul Verma

Abstract: Central to the widespread use of t-distributed stochastic neighbor embedding (t-SNE) is the conviction that it produces visualizations whose structure roughly matches that of the input. To the contrary, we prove that (1) the strength of the input clustering, and (2) the extremity of outlier points, cannot be reliably inferred from the t-SNE output. We demonstrate the prevalence of these failure modes in practice as well.

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2510.07746 [cs.LG] (or arXiv:2510.07746v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2510.07746 (arXiv-issued DOI via DataCite)

Submission history
From: Noah Bergam
[v1] Thu, 9 Oct 2025 03:34:36 UTC (4,411 KB)
[v2] Mon, 2 Mar 2026 05:39:53 UTC (5,452 KB)