[2510.24318] Transformers can do Bayesian Clustering
Summary
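The paper presents Cluster-PFN, a Transformer-based model for unsupervised Bayesian clustering. It estimates the number of clusters more accurately than AIC, BIC, and Variational Inference (VI), and achieves clustering quality competitive with VI while being orders of magnitude faster.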
Why It Matters
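This research addresses two challenges in Bayesian clustering: its computational cost at scale, and missing values, which simple imputation fills in without accounting for the associated uncertainty. By extending Prior-Data Fitted Networks (PFNs) to clustering, it offers a scalable alternative whose clustering quality is competitive with Variational Inference at orders of magnitude lower runtime, making it relevant to practitioners in data science and machine learning.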
Key Takeaways
- Cluster-PFN achieves clustering quality competitive with Variational Inference (VI) while being orders of magnitude faster.
- It can be trained on priors that include missing data, and it outperforms imputation-based baselines on real-world genomic datasets at high missingness.
- The model estimates the number of clusters more accurately than handcrafted model selection procedures such as AIC, BIC, and VI.
Computer Science > Machine Learning
arXiv:2510.24318 (cs)
[Submitted on 28 Oct 2025 (v1), last revised 18 Feb 2026 (this version, v3)]
Title: Transformers can do Bayesian Clustering
Authors: Prajit Bhaskaran, Tom Viering
Abstract: Bayesian clustering accounts for uncertainty but is computationally demanding at scale. Furthermore, real-world datasets often contain missing values, and simple imputation ignores the associated uncertainty, resulting in suboptimal results. We present Cluster-PFN, a Transformer-based model that extends Prior-Data Fitted Networks (PFNs) to unsupervised Bayesian clustering. Trained entirely on synthetic datasets generated from a finite Gaussian Mixture Model (GMM) prior, Cluster-PFN learns to estimate the posterior distribution over both the number of clusters and the cluster assignments. Our method estimates the number of clusters more accurately than handcrafted model selection procedures such as AIC, BIC and Variational Inference (VI), and achieves clustering quality competitive with VI while being orders of magnitude faster. Cluster-PFN can be trained on complex priors that include missing data, outperforming imputation-based baselines on real-world genomic datasets, at high missingness. These results show that the Cluster-PFN can provide scalable and flexible Bayesian clustering.
Subjects: Machine Learning (cs....
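The abstract says Cluster-PFN is trained entirely on synthetic datasets drawn from a finite GMM prior. As a rough illustration of what sampling one such training dataset could look like, here is a minimal sketch using only the Python standard library. The function name `sample_gmm_dataset` and every hyperparameter in it (uniform prior over the number of components, symmetric Dirichlet weights, the mean and standard deviation ranges) are assumptions made for this example; the abstract does not specify the authors' actual prior.

```python
import random


def sample_gmm_dataset(n_points=100, dim=2, max_clusters=5, seed=None):
    """Draw one synthetic dataset from a hypothetical finite GMM prior."""
    rng = random.Random(seed)

    # Number of mixture components: uniform over 1..max_clusters (assumed).
    k = rng.randint(1, max_clusters)

    # Mixture weights: symmetric Dirichlet(1), built from normalized
    # Gamma(1, 1) draws (assumed).
    raw = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Component means ~ N(0, 3^2) per dimension, and one isotropic
    # standard deviation per component ~ U(0.2, 1.0) (both assumed).
    means = [[rng.gauss(0.0, 3.0) for _ in range(dim)] for _ in range(k)]
    stds = [rng.uniform(0.2, 1.0) for _ in range(k)]

    # Assign each point to a component, then sample its coordinates.
    labels = rng.choices(range(k), weights=weights, k=n_points)
    points = [
        [rng.gauss(means[z][d], stds[z]) for d in range(dim)]
        for z in labels
    ]
    return points, labels, k


points, labels, k = sample_gmm_dataset(n_points=200, dim=2, max_clusters=5, seed=0)
```

Each call yields the points together with the ground-truth assignments and component count, which is the kind of supervision a network needs in order to learn a posterior over both quantities. Note that `k` counts mixture components, so a component with a small weight may receive no points in a given draw.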