[2509.08625] An upper bound on the silhouette evaluation metric for clustering
Computer Science > Machine Learning
arXiv:2509.08625 (cs)
[Submitted on 10 Sep 2025 (v1), last revised 20 Mar 2026 (this version, v5)]

Title: An upper bound on the silhouette evaluation metric for clustering
Authors: Hugo Sträng, Tai Dinh

Abstract: The silhouette coefficient quantifies, for each observation, the balance between within-cluster cohesion and between-cluster separation, taking values in the range [-1, 1]. The average silhouette width (ASW) is a widely used internal measure of clustering quality, with higher values indicating more cohesive and well-separated clusters. However, the dataset-specific maximum of ASW is typically unknown, and the standard upper limit of 1 is rarely attainable. In this work, we derive for each data point a sharp upper bound on its silhouette width and aggregate these to obtain a canonical upper bound for the ASW. This bound, often substantially below 1, enhances the interpretability of empirical ASW values by providing guidance on how close a given clustering result is to the best possible outcome for that dataset. We evaluate the usefulness of the upper bound on a variety of datasets and conclude that it can meaningfully enrich cluster quality evaluation; however, its practical relevance depends on the specific dataset. Finally, we extend the framework to establish an upper bound for the macro-ave...
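
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of the standard silhouette width, s(i) = (b(i) - a(i)) / max(a(i), b(i)), and of the ASW as its mean over all points. It does not implement the paper's upper bound, whose construction is not given in the abstract. The function names, the use of Euclidean distance, the toy data, and the convention that a singleton cluster gets width 0 are all assumptions made for illustration.

```python
import math


def silhouette_widths(points, labels):
    """Per-point silhouette widths s(i) = (b - a) / max(a, b).

    a = mean distance from point i to the other members of its own cluster
    b = smallest mean distance from point i to the members of any other cluster
    Euclidean distance is an assumption; any dissimilarity could be substituted.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Assumed convention: a point alone in its cluster gets width 0.
            widths.append(0.0)
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        # Guard against coincident points, where both a and b are zero.
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def average_silhouette_width(points, labels):
    """ASW: the mean of the per-point silhouette widths."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)


# Toy data (hypothetical): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good_asw = average_silhouette_width(points, [0, 0, 1, 1])
bad_asw = average_silhouette_width(points, [0, 1, 0, 1])
print(f"ASW, natural grouping: {good_asw:.3f}")
print(f"ASW, mismatched grouping: {bad_asw:.3f}")
```

Even for the natural grouping of this toy data the ASW stays below 1, which is consistent with the abstract's remark that the nominal limit of 1 is rarely attainable; the mismatched grouping yields a negative value.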