On the Expressive Power of Contextual Relations in Transformers
Statistics > Machine Learning — arXiv:2603.25860 (stat)
[Submitted on 26 Mar 2026]

Authors: Demián Fraiman

Abstract: Transformer architectures have achieved remarkable empirical success in modeling contextual relationships in natural language, yet a precise mathematical characterization of their expressive power remains incomplete. In this work, we introduce a measure-theoretic framework for contextual representations in which texts are modeled as probability measures over a semantic embedding space, and contextual relations between words are represented as coupling measures between them. Within this setting, we introduce the Sinkhorn Transformer, a transformer-like architecture. Our main result is a universal approximation theorem: any continuous coupling function between probability measures, which encodes the semantic-relation coupling measure, can be uniformly approximated by a Sinkhorn Transformer with appropriate parameters.

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2603.25860 [stat.ML] (or arXiv:2603.25860v1 [stat.ML] for this version)
DOI: https://doi.org/10.48550/arXiv.2603.25860 (arXiv-issued DOI via DataCite, pending registration)

Submission history: [v1] Thu, 26 Mar 2026, from Demián Fraiman
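As context for the abstract: the paper's construction is not reproduced on this page, but the coupling measures it refers to are the kind of object produced by Sinkhorn normalization from entropic optimal transport. Below is a minimal sketch, assuming uniform marginals and a raw similarity matrix as input; the function name, parameters, and marginals are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def sinkhorn_coupling(scores, n_iters=50, epsilon=0.1):
    # Hypothetical sketch (not the paper's architecture): rescale a
    # similarity matrix into an approximate coupling between two
    # uniform empirical measures via Sinkhorn iterations.
    n, m = scores.shape
    K = np.exp(scores / epsilon)        # Gibbs kernel of the scores
    r = np.full(n, 1.0 / n)             # assumed uniform row marginal
    c = np.full(m, 1.0 / m)             # assumed uniform column marginal
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):
        u = r / (K @ v)                 # match row sums to r
        v = c / (K.T @ u)               # match column sums to c
    return u[:, None] * K * v[None, :]  # coupling P: P @ 1 ≈ r, P.T @ 1 ≈ c

# Example: couple 4 tokens of one text with 6 tokens of another.
rng = np.random.default_rng(0)
P = sinkhorn_coupling(rng.normal(size=(4, 6)))
print(P.sum(axis=1))  # ≈ [0.25, 0.25, 0.25, 0.25]
print(P.sum(axis=0))  # ≈ [1/6, ..., 1/6]
```

Each row of P is a distribution over the second text's tokens, so P plays the role of a coupling between the two empirical measures; an actual Sinkhorn Transformer would presumably compute the score matrix from learned query and key maps rather than take it as given.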