[2602.19393] In Defense of Cosine Similarity: Normalization Eliminates the Gauge Freedom

arXiv - Machine Learning

Summary

This paper defends cosine similarity in machine learning, arguing that constraining embeddings to the unit sphere eliminates the gauge freedom that can otherwise render cosine similarity arbitrary, making it a well-defined distance on normalized embeddings.

Why It Matters

Understanding the validity of cosine similarity is crucial for practitioners in machine learning, particularly those using embeddings. This paper clarifies misconceptions about cosine similarity's reliability when embeddings are properly normalized, which can significantly impact model performance and interpretation.

Key Takeaways

  • Cosine similarity is valid when embeddings are normalized.
  • Normalization removes gauge freedom issues associated with cosine similarity.
  • Cosine distance equates to half the squared Euclidean distance on normalized embeddings.
  • Reported pathologies of cosine similarity stem from incompatible (dot-product) training objectives, not from the metric itself.
  • Proper normalization leads to identical neighbor rankings in cosine and Euclidean spaces.
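The identity and the ranking claim above can be checked numerically. A minimal sketch with NumPy (the synthetic embeddings and variable names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
# Project every embedding onto the unit sphere S^{d-1}
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

q = emb[0]                                # query vector (already unit-norm)
cos_dist = 1.0 - emb @ q                  # cosine distance to all points
sq_euc = np.sum((emb - q) ** 2, axis=1)   # squared Euclidean distance

# Identity: ||u - v||^2 = 2(1 - u·v) for unit vectors,
# i.e. cosine distance is half the squared Euclidean distance
assert np.allclose(sq_euc, 2.0 * cos_dist)

# Monotonic equivalence: identical neighbor rankings
assert np.array_equal(np.argsort(cos_dist), np.argsort(sq_euc))
```

Because the two distances differ only by the constant factor 2 on the sphere, any nearest-neighbor search over normalized embeddings returns the same ordering under either metric.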

Computer Science > Machine Learning

arXiv:2602.19393 (cs) [Submitted on 23 Feb 2026]

Title: In Defense of Cosine Similarity: Normalization Eliminates the Gauge Freedom

Authors: Taha Bouhsine

Abstract: Steck, Ekanadham, and Kallus [arXiv:2403.05440] demonstrate that cosine similarity of learned embeddings from matrix factorization models can be rendered arbitrary by a diagonal ``gauge'' matrix $D$. Their result is correct and important for practitioners who compute cosine similarity on embeddings trained with dot-product objectives. However, we argue that their conclusion, cautioning against cosine similarity in general, conflates the pathology of an incompatible training objective with the geometric validity of cosine distance on the unit sphere. We prove that when embeddings are constrained to the unit sphere $\mathbb{S}^{d-1}$ (either during or after training with an appropriate objective), the $D$-matrix ambiguity vanishes identically, and cosine distance reduces to exactly half the squared Euclidean distance. This monotonic equivalence implies that cosine-based and Euclidean-based neighbor rankings are identical on normalized embeddings. The ``problem'' with cosine similarity is not cosine similarity, it is the failure to normalize.

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2602.19393 [cs.LG]
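The half-squared-Euclidean identity stated in the abstract follows in one line by expanding the norm for unit vectors $u, v \in \mathbb{S}^{d-1}$ (so $\|u\| = \|v\| = 1$):

```latex
\|u - v\|^2 = \|u\|^2 - 2\,u^\top v + \|v\|^2 = 2 - 2\cos(u, v),
\qquad\text{hence}\qquad
d_{\cos}(u, v) \;\equiv\; 1 - \cos(u, v) \;=\; \tfrac{1}{2}\,\|u - v\|^2 .
```

Since $x \mapsto \tfrac{1}{2}x$ is strictly increasing, sorting neighbors by $d_{\cos}$ and by squared Euclidean distance yields the same ranking, which is the monotonic equivalence the paper relies on.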
