[2512.07805] Group Representational Position Encoding
Summary
The paper introduces GRAPE (Group Representational Position Encoding), a framework that unifies the two main families of positional encoding, multiplicative rotations and additive logit biases, under group actions, with applications to long-context models in machine learning.
Why It Matters
GRAPE offers a unified approach to positional encoding, which is crucial for improving the performance of models dealing with long sequences. By integrating existing methods like RoPE and ALiBi, it provides a more comprehensive framework that can enhance various applications in natural language processing and beyond.
Key Takeaways
- GRAPE combines multiplicative and additive positional encoding methods.
- It extends existing positional encoding methods such as RoPE and ALiBi, recovering both as special cases.
- The framework supports efficient processing of long-context data.
- It introduces a principled design space for positional geometry.
- Non-commuting mixtures of generators capture cross-subspace feature coupling at modest per-head cost.
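The additive side of the framework can be illustrated with an ALiBi-style distance bias added to attention logits before the softmax. This is a minimal sketch, not the paper's exact parameterization: the function name `additive_bias` and the geometric slope schedule `2^(-8h/H)` are illustrative assumptions.

```python
import numpy as np

def additive_bias(seq_len, num_heads):
    """ALiBi-style additive positional logits: each head adds a linear
    penalty proportional to query-key distance to its attention scores.
    Slopes follow a geometric schedule (an illustrative choice)."""
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    dist = np.abs(i - j)                       # relative distance |i - j|
    return -slopes[:, None, None] * dist       # shape (H, T, T)

# Usage: per head h, logits = q @ k.T / sqrt(d_head) + additive_bias(T, H)[h]
```

Because the bias depends only on `i - j`, it is translation-invariant, which is the relative-position property the additive family shares with the multiplicative one.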
Computer Science > Machine Learning
arXiv:2512.07805 (cs)
[Submitted on 8 Dec 2025 (v1), last revised 19 Feb 2026 (this version, v3)]
Title: Group Representational Position Encoding
Authors: Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Yang Yuan, Quanquan Gu, Andrew Chi-Chih Yao
Abstract: We present GRAPE (Group Representational Position Encoding), a unified framework for positional encoding based on group actions. GRAPE unifies two families of mechanisms: (i) multiplicative rotations (Multiplicative GRAPE) in $\operatorname{SO}(d)$ and (ii) additive logit biases (Additive GRAPE) arising from unipotent actions in the general linear group $\mathrm{GL}$. In Multiplicative GRAPE, a position $n \in \mathbb{Z}$ (or $t \in \mathbb{R}$) acts as $\mathbf{G}(n) = \exp(n \, \omega \, \mathbf{L})$ with a rank-2 skew-symmetric generator $\mathbf{L} \in \mathbb{R}^{d \times d}$, yielding a relative, compositional, norm-preserving map with a closed-form matrix exponential. RoPE is recovered exactly when the $d/2$ planes correspond to canonical coordinate pairs with a log-uniform spectrum. Learned commuting subspaces and compact non-commuting mixtures strictly extend this geometry to capture cross-subspace feature coupling at $O(d)$ and $O(r d)$ cost per head, respectively. In Additive GRAPE, additive logits arise from rank-1...
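The multiplicative map described in the abstract can be sketched directly: for a rank-2 skew-symmetric generator $\mathbf{L} = \mathbf{u}\mathbf{v}^\top - \mathbf{v}\mathbf{u}^\top$ with orthonormal $\mathbf{u}, \mathbf{v}$, we have $\mathbf{L}^3 = -\mathbf{L}$, so $\exp(\theta \mathbf{L})$ has a Rodrigues-type closed form. The sketch below, with hypothetical function names, checks the properties the abstract claims (compositionality and norm preservation); it is not the paper's implementation.

```python
import numpy as np

def rank2_skew_generator(d, rng):
    """Random rank-2 skew-symmetric generator L = u v^T - v u^T with
    orthonormal u, v. Then L^2 = -(u u^T + v v^T) and L^3 = -L."""
    q, _ = np.linalg.qr(rng.standard_normal((d, 2)))  # orthonormal columns
    u, v = q[:, 0], q[:, 1]
    return np.outer(u, v) - np.outer(v, u)

def grape_mult(n, omega, L):
    """G(n) = exp(n * omega * L), evaluated in closed form using
    exp(theta L) = I + sin(theta) L + (1 - cos(theta)) L^2,
    which holds because L^3 = -L (rotation by theta in the u-v plane)."""
    theta = n * omega
    d = L.shape[0]
    return np.eye(d) + np.sin(theta) * L + (1.0 - np.cos(theta)) * (L @ L)
```

With `u, v` chosen as canonical coordinate pairs `e_{2k}, e_{2k+1}` and a log-uniform frequency spectrum over the $d/2$ planes, this reduces to the familiar RoPE block-diagonal rotation.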