[2510.27480] Simplex-to-Euclidean Bijections for Categorical Flow Matching
Summary
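The paper proposes a method for learning and sampling from probability distributions supported on the simplex. It maps the open simplex to Euclidean space via smooth bijections defined through the Aitchison geometry, and it handles categorical data with a Dirichlet interpolation that dequantizes discrete observations into continuous ones.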
Why It Matters
Categorical observations sit at the vertices of the simplex, so continuous density models cannot be applied to them directly. By moving the problem to Euclidean space through a bijection that respects the Aitchison geometry, the approach permits density modeling with standard Euclidean tools while still allowing exact recovery of the original discrete distribution.
Key Takeaways
- Maps the open simplex to Euclidean space using smooth bijections.
- Uses the Aitchison geometry to define these mappings.
- Achieves competitive performance on both synthetic and real-world datasets.
- Enables density modeling in Euclidean space while still allowing exact recovery of the original discrete distribution.
- Offers an alternative to previous methods that operate on the simplex using Riemannian geometry or custom noise processes.
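To make the first two takeaways concrete, here is a minimal sketch of one standard simplex-to-Euclidean bijection from compositional data analysis, the isometric log-ratio (ILR) transform. This summary does not state which bijections the paper actually uses, so the choice of ILR, the Helmert-style basis, and the function names `helmert_basis`, `ilr`, and `ilr_inverse` are all assumptions made for illustration.

```python
import numpy as np

def helmert_basis(k):
    """Orthonormal basis (k x (k-1)) of the zero-sum hyperplane in R^k.

    Illustrative choice; any orthonormal basis of that hyperplane works.
    """
    V = np.zeros((k, k - 1))
    for i in range(1, k):
        V[:i, i - 1] = 1.0 / i          # first i entries
        V[i, i - 1] = -1.0              # balancing entry, column sums to zero
        V[:, i - 1] *= np.sqrt(i / (i + 1.0))  # normalize to unit length
    return V

def ilr(x):
    """Map points of the open simplex in R^k to R^(k-1)."""
    x = np.asarray(x, dtype=float)
    log_x = np.log(x)
    clr = log_x - log_x.mean(axis=-1, keepdims=True)  # centered log-ratio
    return clr @ helmert_basis(x.shape[-1])

def ilr_inverse(y):
    """Map points of R^(k-1) back to the open simplex in R^k."""
    y = np.asarray(y, dtype=float)
    logits = y @ helmert_basis(y.shape[-1] + 1).T
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)

x = np.array([0.2, 0.3, 0.5])
y = ilr(x)                # a point in R^2
x_back = ilr_inverse(y)   # recovers x up to floating-point error
```

Under this transform, the Aitchison distance between two compositions equals the ordinary Euclidean distance between their images, which is one sense in which a Euclidean model can be said to respect the Aitchison geometry.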
arXiv:2510.27480 (cs)
[Submitted on 31 Oct 2025 (v1), last revised 26 Feb 2026 (this version, v2)]
Title: Simplex-to-Euclidean Bijections for Categorical Flow Matching
Authors: Bernardo Williams, Victor M. Yeom-Song, Marcelo Hartmann, Arto Klami
Abstract: We propose a method for learning and sampling from probability distributions supported on the simplex. Our approach maps the open simplex to Euclidean space via smooth bijections, leveraging the Aitchison geometry to define the mappings, and supports modeling categorical data by a Dirichlet interpolation that dequantizes discrete observations into continuous ones. This enables density modeling in Euclidean space through the bijection while still allowing exact recovery of the original discrete distribution. Compared to previous methods that operate on the simplex using Riemannian geometry or custom noise processes, our approach works in Euclidean space while respecting the Aitchison geometry, and achieves competitive performance on both synthetic and real-world data sets.
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2510.27480 [cs.LG] (or arXiv:2510.27480v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2510.27480 (arXiv-issued DOI via DataCite)
Submission history: From: Bernardo Williams
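The abstract's idea of dequantizing discrete observations into continuous ones, while keeping the original labels recoverable, can be illustrated with a small sketch. The paper's actual Dirichlet interpolation is not specified on this page, so the construction below is a hypothetical stand-in: it mixes a one-hot vertex with a draw from a flat Dirichlet, using an assumed mixing weight `eps` below 0.5 so that the largest coordinate always identifies the label. The names `dequantize` and `requantize` are likewise illustrative.

```python
import numpy as np

def dequantize(labels, num_classes, eps=0.1, rng=None):
    """Turn integer labels into points in the interior of the simplex.

    Hypothetical scheme (not the paper's): convex mix of the one-hot
    vertex with Dirichlet(1, ..., 1) noise, weight eps on the noise.
    """
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie in (0, 0.5) so argmax recovers the label")
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    one_hot = np.eye(num_classes)[labels]
    noise = rng.dirichlet(np.ones(num_classes), size=labels.shape[0])
    return (1.0 - eps) * one_hot + eps * noise

def requantize(points):
    """Recover labels: the labeled coordinate exceeds 1 - eps > 0.5."""
    return np.argmax(points, axis=-1)

labels = np.array([0, 2, 1, 2])
points = dequantize(labels, num_classes=3, rng=np.random.default_rng(0))
recovered = requantize(points)  # equals labels by construction
```

Each row of `points` sums to one and has strictly positive entries with probability one, so it lies in the open simplex, where a log-ratio bijection to Euclidean space is well defined.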