[P] PCA before truncation makes non-Matryoshka embeddings compressible: results on BGE-M3
Most embedding models are not Matryoshka-trained, so naive dimension truncation tends to destroy them. I tested a simple alternative: fit PCA once on a sample of embeddings, rotate vectors into the PCA basis, and then truncate. The idea is that PCA concentrates signal into the leading components, so truncation stops being arbitrary.

On a 10K-vector BGE-M3 sample (1024d), I got:

- 512d: naive truncation 0.707 cosine, PCA-first 0.996
- 384d: naive 0.609, PCA-first 0.990
- 256d: naive 0.467, PCA-first 0.9...
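For concreteness, here is a minimal sketch of the pipeline in Python with scikit-learn. The post doesn't include code, so two things are assumptions on my part: `emb` is a stand-in for real BGE-M3 embeddings, and I read the "cosine" score as the mean cosine similarity between each original vector and its truncated reconstruction (for naive truncation of unit vectors that works out to about sqrt(k/1024), which matches the 0.707 figure at 512d). The random stand-in data below will not reproduce the PCA-first numbers, since it lacks the anisotropic structure of real embeddings that PCA exploits.

```python
import numpy as np
from sklearn.decomposition import PCA


def mean_cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Mean row-wise cosine similarity between two (n, d) matrices."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a_n * b_n, axis=1)))


def naive_truncation_score(emb: np.ndarray, k: int) -> float:
    """Keep the first k dims, zero the rest, compare to the original."""
    padded = np.zeros_like(emb)
    padded[:, :k] = emb[:, :k]
    return mean_cosine(emb, padded)


def pca_first_score(emb: np.ndarray, k: int) -> float:
    """Fit PCA once, rotate into its basis, truncate to k components,
    then reconstruct back to full width only for scoring purposes."""
    pca = PCA(n_components=k).fit(emb)      # fit once on a sample
    reduced = pca.transform(emb)            # rotate + truncate: the k-dim vectors you'd store
    recon = pca.inverse_transform(reduced)  # back to 1024d so cosines are comparable
    return mean_cosine(emb, recon)


if __name__ == "__main__":
    # Stand-in data; swap in real BGE-M3 embeddings of shape (10_000, 1024).
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(10_000, 1024)).astype(np.float32)
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)
    for k in (512, 384, 256):
        print(k, naive_truncation_score(emb, k), pca_first_score(emb, k))
```

In actual use you would keep the k-dim `pca.transform` output (re-normalized if your index assumes unit vectors) and reuse the one fitted PCA for all future vectors; the reconstruction step here exists only to score fidelity against the full 1024d originals.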