[2604.01843] Investigating Permutation-Invariant Discrete Representation Learning for Spatially Aligned Images


arXiv - Machine Learning


Computer Science > Computer Vision and Pattern Recognition

arXiv:2604.01843 (cs) [Submitted on 2 Apr 2026]

Title: Investigating Permutation-Invariant Discrete Representation Learning for Spatially Aligned Images

Authors: Jamie S. J. Stirling, Noura Al-Moubayed, Hubert P. H. Shum

Abstract: Vector quantization approaches (VQ-VAE, VQ-GAN) learn discrete neural representations of images, but these representations are inherently position-dependent: codes are spatially arranged and contextually entangled, requiring autoregressive or diffusion-based priors to model their dependencies at sample time. In this work, we ask whether positional information is necessary for discrete representations of spatially aligned data. We propose the permutation-invariant vector-quantized autoencoder (PI-VQ), in which latent codes are constrained to carry no positional information. We find that this constraint encourages codes to capture global, semantic features, and enables direct interpolation between images without a learned prior. To address the reduced information capacity of permutation-invariant representations, we introduce matching quantization, a vector quantization algorithm based on optimal bipartite matching that increases effective bottleneck capacity by $3.5\times$ relative to naive ne...
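The abstract names "matching quantization" as a vector quantization step based on optimal bipartite matching, but gives no implementation details. As an illustrative sketch only (not the paper's actual algorithm), the core idea of replacing independent nearest-neighbour codebook lookup with a one-to-one optimal assignment can be expressed with the Hungarian algorithm, here via SciPy's `linear_sum_assignment`; the function name `matching_quantize` and the array shapes are assumptions for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def matching_quantize(latents, codebook):
    """Quantize latents by optimal bipartite matching to the codebook.

    Unlike standard VQ-VAE lookup, where each latent independently picks
    its nearest code (so several latents may collapse onto one entry),
    bipartite matching assigns each latent a *distinct* codebook entry
    while minimizing the total squared distance.

    latents:  (n, d) array of encoder outputs
    codebook: (k, d) array of code vectors, with k >= n
    Returns (indices, quantized) where indices are all distinct.
    """
    # Pairwise squared distances: cost[i, j] = ||latents[i] - codebook[j]||^2
    cost = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # Hungarian algorithm: minimum-cost one-to-one assignment
    rows, cols = linear_sum_assignment(cost)
    return cols, codebook[cols]
```

Because the assignment is a set-to-set matching rather than a positional lookup, the resulting code set is naturally compatible with a permutation-invariant latent representation: reordering the latents reorders the matching but selects the same set of codes.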

Originally published on April 03, 2026. Curated by AI News.

Related Articles

Machine Learning

HydraLM: 22× faster decoding and 16× smaller state memory in long-context inference experiments [P]

I’ve been experimenting with HydraLM, a long-context model for inference, and the numbers are getting a bit wild: the repo’s benchmark su...

Reddit - Machine Learning · 1 min
Machine Learning

How to know if a research-oriented role is for you? [D]

I’m currently a first-year Master’s student in Data Science & AI, and I’m trying to figure out whether a research-oriented career is ...

Reddit - Machine Learning · 1 min
Machine Learning

GPU Compass – open-source, real-time GPU pricing across 20+ clouds [P]

We maintain an open-source catalog of cloud GPU offerings (skypilot-catalog, Apache 2.0). It auto-fetches pricing from 20+ cloud APIs eve...

Reddit - Machine Learning · 1 min
Machine Learning

5 AI Models Tried to Scam Me. Some of Them Were Scary Good | WIRED

The cyber capabilities of AI models have experts rattled. AI’s social skills may be just as dangerous.

Wired - AI · 8 min
