[2603.00210] Universal NP-Hardness of Clustering under General Utilities
About this article
Abstract page for arXiv paper 2603.00210: Universal NP-Hardness of Clustering under General Utilities
Computer Science > Computational Complexity arXiv:2603.00210 (cs) [Submitted on 27 Feb 2026] Title:Universal NP-Hardness of Clustering under General Utilities Authors:Angshul Majumdar View a PDF of the paper titled Universal NP-Hardness of Clustering under General Utilities, by Angshul Majumdar View PDF HTML (experimental) Abstract:Clustering is a central primitive in unsupervised learning, yet practice is dominated by heuristics whose outputs can be unstable and highly sensitive to representations, hyperparameters, and initialisation. Existing theoretical results are largely objective-specific and do not explain these behaviours at a unifying level. We formalise the common optimisation core underlying diverse clustering paradigms by defining the Universal Clustering Problem (UCP): the maximisation of a polynomial-time computable partition utility over a finite metric space. We prove the NP-hardness of UCP via two independent polynomial-time reductions from graph colouring and from exact cover by 3-sets (X3C). By mapping ten major paradigms -- including k-means, GMMs, DBSCAN, spectral clustering, and affinity propagation -- to the UCP framework, we demonstrate that each inherits this fundamental intractability. Our results provide a unified explanation for characteristic failure modes, such as local optima in alternating methods and greedy merge-order traps in hierarchical clustering. Finally, we show that clustering limitations reflect interacting computational and episte...