[2512.09275] Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression

arXiv - Machine Learning


Statistics > Machine Learning
arXiv:2512.09275 (stat) [Submitted on 10 Dec 2025 (v1), last revised 24 Mar 2026 (this version, v2)]

Title: Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression
Authors: Weiyi He, Yue Xing

Abstract: Positional encoding (PE) is a core architectural component of Transformers, yet its impact on a Transformer's generalization and robustness remains unclear. In this work, we provide the first generalization analysis for a single-layer Transformer under in-context regression that explicitly accounts for a fully trainable PE module. Our result shows that PE systematically enlarges the generalization gap. Extending to the adversarial setting, we derive an adversarial Rademacher generalization bound and find that the gap between models with and without PE is magnified under attack, demonstrating that PE amplifies model vulnerability. Our bounds are validated empirically in a simulation study. Together, this work establishes a new framework for understanding clean and adversarial generalization in in-context learning (ICL) with PE.

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2512.09275 [stat.ML] (or arXiv:2512.09275v2 [stat.ML] for this version)
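To make the setting concrete, here is a minimal sketch of the kind of model the abstract describes: a single-layer linear self-attention Transformer applied to an in-context linear-regression prompt, with an additive, fully trainable positional-encoding matrix `P`. All variable names, the linear (softmax-free) attention form, and the additive way `P` enters are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 16                      # feature dimension, number of in-context examples

# In-context regression prompt: columns are (x_i, y_i) pairs; the final
# column is the query x with its label masked to 0.
w_star = rng.normal(size=d)       # task vector defining y_i = <w_star, x_i>
X = rng.normal(size=(d, n + 1))
y = w_star @ X
Z = np.vstack([X, y[None, :]])    # (d+1) x (n+1) prompt matrix
Z[-1, -1] = 0.0                   # mask the query label

# Trainable parameters (hypothetical shapes): merged key-query matrix,
# value matrix, and the additive positional-encoding module P.
W_KQ = rng.normal(size=(d + 1, d + 1)) / np.sqrt(d + 1)
W_V  = rng.normal(size=(d + 1, d + 1)) / np.sqrt(d + 1)
P    = 0.1 * rng.normal(size=(d + 1, n + 1))   # trainable PE, one column per position

def forward(Z, P, W_KQ, W_V):
    """Single-layer linear self-attention with additive trainable PE."""
    H = Z + P                              # inject positional information
    attn = (H.T @ W_KQ @ H) / (n + 1)      # (n+1) x (n+1) attention scores
    out = W_V @ H @ attn                   # linear attention, no softmax
    return out[-1, -1]                     # prediction read off the query slot

pred = forward(Z, P, W_KQ, W_V)
```

Because `P` adds `(d+1)(n+1)` trainable entries on top of the attention weights, the hypothesis class is strictly larger than the PE-free one, which is the intuition behind the paper's claim that PE enlarges the (clean and adversarial) Rademacher-complexity bounds.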

Originally published on March 25, 2026. Curated by AI News.


