[2604.01466] Efficient Equivariant Transformer for Self-Driving Agent Modeling
Computer Science > Robotics
arXiv:2604.01466 (cs) [Submitted on 1 Apr 2026]
Title: Efficient Equivariant Transformer for Self-Driving Agent Modeling
Authors: Scott Xu, Dian Chen, Kelvin Wong, Chris Zhang, Kion Fallah, Raquel Urtasun

Abstract: Accurately modeling agent behaviors is an important task in self-driving. It is also a task with many symmetries: equivariance to the ordering of agents and objects in the scene, and equivariance to arbitrary roto-translations of the entire scene as a whole, i.e., SE(2)-equivariance. The transformer architecture is a ubiquitous tool for modeling these symmetries. While standard self-attention is inherently permutation-equivariant, explicit pairwise relative positional encodings have been the standard way to introduce SE(2)-equivariance. However, this approach adds a cost that is quadratic in the number of agents, limiting its scalability to larger scenes and batch sizes. In this work, we propose DriveGATr, a novel transformer-based architecture for agent modeling that achieves SE(2)-equivariance without the computational cost of existing methods. Inspired by recent advances in geometric deep learning, DriveGATr encodes scene elements as multivectors in the 2D projective geometric algebra $\mathbb{R}^*_{2,0,1}$ and processes them with a stack of equivariant trans...
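To make the symmetry argument concrete, the sketch below (not the paper's method, just a minimal NumPy illustration with hypothetical helper names) shows why pairwise relative positional encodings are SE(2)-invariant, and hence why attention built on them is SE(2)-equivariant: expressing each pairwise displacement in the observing agent's local frame cancels any global roto-translation of the scene. It also makes the quadratic cost visible, since the feature tensor has shape (N, N, 2) for N agents.

```python
import numpy as np

def relative_features(pos, heading):
    """Pairwise displacements, rotated into each observing agent's frame.

    pos:     (N, 2) agent positions in the global frame
    heading: (N,)   agent yaw angles in the global frame
    returns: (N, N, 2) tensor, entry [i, j] = position of agent j
             as seen from agent i's local frame (O(N^2) storage/compute)
    """
    d = pos[None, :, :] - pos[:, None, :]  # (N, N, 2), d[i, j] = p_j - p_i
    c, s = np.cos(heading), np.sin(heading)
    # R[i] rotates global-frame vectors into agent i's frame (rotation by -heading_i)
    R = np.stack([np.stack([c, s], -1), np.stack([-s, c], -1)], -2)  # (N, 2, 2)
    return np.einsum("nij,nmj->nmi", R, d)

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 2))
heading = rng.uniform(0.0, 2.0 * np.pi, size=5)

# Apply an arbitrary global roto-translation (an SE(2) action) to the whole scene.
theta, t = 0.7, np.array([3.0, -1.0])
Rg = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
pos2 = pos @ Rg.T + t
heading2 = heading + theta

# The relative features are unchanged: the global motion cancels out.
assert np.allclose(relative_features(pos, heading),
                   relative_features(pos2, heading2))
```

The invariance follows because the transformed displacement R(θ)d_ij, rotated by the transformed heading −(h_i + θ), reduces to R(−h_i)d_ij, which is the original feature. DriveGATr's contribution, per the abstract, is achieving this same equivariance without materializing such an O(N²) encoding tensor.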