[2512.07805] Group Representational Position Encoding

arXiv - Machine Learning · 4 min ·

Summary

The paper introduces GRAPE (Group Representational Position Encoding), a framework for positional encoding that integrates multiplicative rotations and additive logit biases, enhancing long-context models in machine learning.

Why It Matters

GRAPE offers a unified approach to positional encoding, which is crucial for improving the performance of models dealing with long sequences. By integrating existing methods like RoPE and ALiBi, it provides a more comprehensive framework that can enhance various applications in natural language processing and beyond.

Key Takeaways

  • GRAPE combines multiplicative and additive positional encoding methods.
  • It subsumes and extends existing methods such as RoPE and ALiBi.
  • The framework supports efficient processing of long-context data.
  • It introduces a principled design space for positional geometry.
  • The approach can enhance feature coupling in complex models.
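The combination in the first takeaway, a multiplicative rotation of queries and keys plus an additive distance bias on the logit, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the function names and the slope value are invented for illustration, and the rotation shown is the standard RoPE-style pairwise rotation with a log-uniform frequency spectrum.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Rotate consecutive feature pairs of x by position-dependent angles
    (RoPE-style multiplicative position encoding)."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)   # log-uniform spectrum
    ang = pos * freqs
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def attention_logit(q, k, m, n, slope=0.0625):
    """One attention logit combining a multiplicative rotation (RoPE-style)
    with an additive, ALiBi-style linear distance penalty."""
    score = rope_rotate(q, m) @ rope_rotate(k, n)
    bias = -slope * abs(m - n)
    return score + bias
```

Because both components depend only on the offset between the query position m and the key position n, the resulting logit is a purely relative position signal.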

Computer Science > Machine Learning

arXiv:2512.07805 (cs) [Submitted on 8 Dec 2025 (v1), last revised 19 Feb 2026 (this version, v3)]

Title: Group Representational Position Encoding
Authors: Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Yang Yuan, Quanquan Gu, Andrew Chi-Chih Yao

Abstract: We present GRAPE (Group Representational Position Encoding), a unified framework for positional encoding based on group actions. GRAPE unifies two families of mechanisms: (i) multiplicative rotations (Multiplicative GRAPE) in $\operatorname{SO}(d)$ and (ii) additive logit biases (Additive GRAPE) arising from unipotent actions in the general linear group $\mathrm{GL}$. In Multiplicative GRAPE, a position $n \in \mathbb{Z}$ (or $t \in \mathbb{R}$) acts as $\mathbf{G}(n) = \exp(n \, \omega \, \mathbf{L})$ with a rank-2 skew-symmetric generator $\mathbf{L} \in \mathbb{R}^{d \times d}$, yielding a relative, compositional, norm-preserving map with a closed-form matrix exponential. RoPE is recovered exactly when the $d/2$ planes correspond to canonical coordinate pairs with a log-uniform spectrum. Learned commuting subspaces and compact non-commuting mixtures strictly extend this geometry to capture cross-subspace feature coupling at $O(d)$ and $O(r d)$ cost per head, respectively. In Additive GRAPE, additive logits arise from rank-1...
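The closed-form matrix exponential mentioned in the abstract can be checked numerically. For a rank-2 skew-symmetric generator built from an orthonormal plane (here the canonical plane of the first two coordinates, as in the RoPE case), $\mathbf{L}^3 = -\mathbf{L}$, so $\exp(\theta \mathbf{L})$ collapses to $I + \sin\theta \, \mathbf{L} + (1 - \cos\theta) \, \mathbf{L}^2$. The following NumPy sketch is an illustration under these assumptions, not code from the paper; the Taylor-series reference implementation is only for verification.

```python
import numpy as np

d = 4
e1, e2 = np.eye(d)[:, 0], np.eye(d)[:, 1]
# Rank-2 skew-symmetric generator for the canonical (e1, e2) plane.
L = np.outer(e1, e2) - np.outer(e2, e1)

def expm_taylor(A, terms=30):
    """Matrix exponential via truncated Taylor series (reference only)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def grape_closed_form(theta):
    """Closed form exp(theta L) = I + sin(theta) L + (1 - cos(theta)) L^2,
    valid because L^3 = -L for this normalized rank-2 skew generator."""
    return np.eye(d) + np.sin(theta) * L + (1 - np.cos(theta)) * (L @ L)

theta = 0.7  # plays the role of n * omega for some position n
G = grape_closed_form(theta)
```

The resulting map is orthogonal (norm-preserving) and compositional: applying the rotation for angle a and then for angle b equals the rotation for a + b, which is the relative-position property the abstract describes.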

Related Articles

AI Safety

NHS staff resist using Palantir software. Staff reportedly cite ethics concerns, privacy worries, and doubt the platform adds much


Reddit - Artificial Intelligence · 1 min ·
Machine Learning

AI assistants are optimized to seem helpful. That is not the same thing as being helpful.

RLHF trains models on human feedback. Humans rate responses they like. And it turns out humans consistently rate confident, fluent, agree...

Reddit - Artificial Intelligence · 1 min ·
Computer Vision

House Democrat Questions Anthropic on AI Safety After Source Code Leak

Rep. Josh Gottheimer, who is generally tough on China, just sent a letter to Anthropic questioning their decision to reduce certain safet...

Reddit - Artificial Intelligence · 1 min ·
LLMs

[2512.21106] Semantic Refinement with LLMs for Graph Representations

Abstract page for arXiv paper 2512.21106: Semantic Refinement with LLMs for Graph Representations

arXiv - Machine Learning · 4 min ·

