[2603.26799] Gaussian Joint Embeddings For Self-Supervised Representation Learning
Computer Science > Machine Learning
arXiv:2603.26799 (cs)
[Submitted on 26 Mar 2026]

Title: Gaussian Joint Embeddings For Self-Supervised Representation Learning
Authors: Yongchao Huang

Abstract: Self-supervised representation learning often relies on deterministic predictive architectures to align context and target views in latent space. While effective in many settings, such methods are limited in genuinely multi-modal inverse problems, where squared-loss prediction collapses towards conditional averages, and they frequently depend on architectural asymmetries to prevent representation collapse. In this work, we propose a probabilistic alternative based on generative joint modeling. We introduce Gaussian Joint Embeddings (GJE) and its multi-modal extension, Gaussian Mixture Joint Embeddings (GMJE), which model the joint density of context and target representations and replace black-box prediction with closed-form conditional inference under an explicit probabilistic model. This yields principled uncertainty estimates and a covariance-aware objective for controlling latent geometry. We further identify a failure mode of naive empirical batch optimization, which we term the Mahalanobis Trace Trap, and develop several remedies spanning parametric, adaptive, and non-parametric settings, including prototype-based GMJE, conditional...
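For a single Gaussian, the closed-form conditional inference the abstract refers to corresponds to the standard conditioning formulas for a joint Gaussian over stacked context and target embeddings. The paper's actual GJE parameterization is not shown on this page, so the following is only a minimal NumPy sketch of those formulas; the function name gaussian_conditional and the toy dimensions are illustrative assumptions, not the authors' code.

import numpy as np

def gaussian_conditional(mu, Sigma, x_ctx, d_ctx):
    # Illustrative sketch, not the paper's implementation.
    # Condition a joint Gaussian N(mu, Sigma) over [context; target]
    # on an observed context embedding x_ctx (the first d_ctx coordinates):
    #   mu_{t|c}    = mu_t + S_tc S_cc^{-1} (x_ctx - mu_c)
    #   Sigma_{t|c} = S_tt - S_tc S_cc^{-1} S_ct
    mu_c, mu_t = mu[:d_ctx], mu[d_ctx:]
    S_cc = Sigma[:d_ctx, :d_ctx]
    S_ct = Sigma[:d_ctx, d_ctx:]
    S_tt = Sigma[d_ctx:, d_ctx:]
    # Solve against S_cc rather than forming its inverse, for numerical stability.
    K = np.linalg.solve(S_cc, S_ct).T   # K = S_tc S_cc^{-1}, using S_cc symmetric
    cond_mean = mu_t + K @ (x_ctx - mu_c)
    cond_cov = S_tt - K @ S_ct
    return cond_mean, cond_cov

# Toy usage with a random SPD joint covariance (assumed sizes).
rng = np.random.default_rng(0)
d = 4
A = rng.normal(size=(2 * d, 2 * d))
Sigma = A @ A.T + 1e-3 * np.eye(2 * d)
mu = np.zeros(2 * d)
mean, cov = gaussian_conditional(mu, Sigma, x_ctx=np.ones(d), d_ctx=d)

In this reading, the conditional mean plays the role of the predictor and the conditional covariance supplies the uncertainty estimate the abstract mentions; a mixture version in the spirit of GMJE would condition each Gaussian component this way and reweight the mixture responsibilities by how well each component explains the observed context.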