A Geometric Perspective on Robustness in Vision Transformers [R]
Hi everyone! I'm sharing a paper I've been working on that investigates how different positional encoding schemes (learned absolute, sinusoidal, and rotary) shape the internal representations of Vision Transformers, and how these representations relate to robustness under distributional shift.

Paper PDF: https://github.com/mahmoud-mannes/neurips-geometry-paper/blob/main/paper/main.pdf

Abstract: Positional embeddings (PEs) in Vision Transformers (ViTs) are known to impact performance and robustness...
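For readers who haven't worked with all three schemes, here is a minimal sketch of how each family injects position information. This is my own generic PyTorch illustration, not the paper's code; the token count, embedding dimension, and the `10000` frequency base are standard assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

def sinusoidal_pe(num_tokens: int, dim: int) -> torch.Tensor:
    """Fixed sinusoidal PE: added once to the patch embeddings, no parameters."""
    pos = torch.arange(num_tokens).unsqueeze(1)                    # (N, 1)
    freq = torch.exp(-torch.log(torch.tensor(10000.0))
                     * torch.arange(0, dim, 2) / dim)              # (dim/2,)
    pe = torch.zeros(num_tokens, dim)
    pe[:, 0::2] = torch.sin(pos * freq)
    pe[:, 1::2] = torch.cos(pos * freq)
    return pe                                                      # (N, dim)

class LearnedAbsolutePE(nn.Module):
    """Learned absolute PE, as in the original ViT: one trainable vector per position."""
    def __init__(self, num_tokens: int, dim: int):
        super().__init__()
        self.pe = nn.Parameter(torch.zeros(1, num_tokens, dim))
        nn.init.trunc_normal_(self.pe, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:           # x: (B, N, dim)
        return x + self.pe

def apply_rotary(q: torch.Tensor, k: torch.Tensor):
    """Rotary PE (RoPE): instead of adding anything to the embeddings, rotate
    each (even, odd) pair of query/key channels by a position-dependent angle."""
    _, n, d = q.shape
    pos = torch.arange(n, dtype=torch.float32)
    freq = 1.0 / (10000.0 ** (torch.arange(0, d, 2).float() / d))
    angles = pos[:, None] * freq[None, :]                          # (N, d/2)
    cos, sin = angles.cos(), angles.sin()

    def rotate(x):
        x1, x2 = x[..., 0::2], x[..., 1::2]
        return torch.stack([x1 * cos - x2 * sin,
                            x1 * sin + x2 * cos], dim=-1).flatten(-2)

    return rotate(q), rotate(k)

# Usage sketch (shapes are illustrative):
x = torch.randn(2, 197, 64)              # (batch, tokens, dim), e.g. 196 patches + CLS
x_sin = x + sinusoidal_pe(197, 64)       # additive, fixed
x_abs = LearnedAbsolutePE(197, 64)(x)    # additive, trainable
q, k = apply_rotary(x, x)                # applied inside attention, per head
```

The key structural difference the comparison hinges on: the first two schemes add position to the token embeddings once at the input, while RoPE rotates queries and keys inside every attention layer, so it encodes relative rather than absolute position.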