[2604.09709] Orthogonal Quadratic Complements for Vision Transformer Feed-Forward Networks
Computer Science > Computer Vision and Pattern Recognition
arXiv:2604.09709 (cs)
[Submitted on 8 Apr 2026]

Title: Orthogonal Quadratic Complements for Vision Transformer Feed-Forward Networks
Authors: Wang Zixian

Abstract: Recent bilinear feed-forward replacements for vision transformers can substantially improve accuracy, but they often conflate two effects: stronger second-order interactions and increased redundancy relative to the main branch. We study a complementary design principle in which auxiliary quadratic features contribute only information not already captured by the dominant hidden representation. To this end, we propose Orthogonal Quadratic Complements (OQC), which construct a low-rank quadratic auxiliary branch and explicitly project it onto the orthogonal complement of the main branch before injection. We further study an efficient low-rank realization (OQC-LR) and gated extensions (OQC-static and OQC-dynamic). Under a parameter-matched Deep-ViT and CIFAR-100 protocol with a fixed penultimate residual readout, full OQC improves an AFBO baseline from 64.25 +/- 0.22 to 65.59 +/- 0.22, while OQC-LR reaches 65.52 +/- 0.25 with a substantially better speed-accuracy tradeoff. On TinyImageNet, the gated extension OQC-dynamic achieves 51.88 +/- 0.32, improving the baseline (50.45 +/- 0.21) by 1.43 points and outperf...
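The abstract's core mechanism can be sketched concretely: form low-rank quadratic features, remove their component along the main-branch hidden vector (a per-token orthogonal projection), and inject only the remainder. The following NumPy sketch illustrates this idea under stated assumptions; the function names (`oqc_inject`, `quadratic_branch`), the per-token projection, and the rank-r elementwise-product form of the quadratic branch are illustrative guesses, not the paper's exact formulation.

```python
import numpy as np

def quadratic_branch(x, U, V, W):
    """Hypothetical low-rank quadratic features: elementwise product of two
    rank-r projections of x, mapped back to model width (an assumed
    OQC-LR-style parameterization)."""
    return ((x @ U) * (x @ V)) @ W

def oqc_inject(h, q, eps=1e-6):
    """Project auxiliary features q onto the orthogonal complement of the
    main-branch vector h, per token, then inject only that complement."""
    # coefficient of q along h, computed over the feature (last) axis
    coef = (q * h).sum(-1, keepdims=True) / ((h * h).sum(-1, keepdims=True) + eps)
    q_perp = q - coef * h  # component of q orthogonal to h
    return h + q_perp      # the main branch is augmented only with new directions

# Toy usage: 4 tokens, width 16, rank 4
rng = np.random.default_rng(0)
h = rng.standard_normal((4, 16))
U, V = rng.standard_normal((16, 4)), rng.standard_normal((16, 4))
W = rng.standard_normal((4, 16))
out = oqc_inject(h, quadratic_branch(h, U, V, W))
# The injected residual (out - h) is orthogonal to h token-by-token.
```

The projection guarantees that, per token, the injected signal carries no component already present in the main branch, which is the "complementary" property the abstract emphasizes.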