Is Attention sink without Positional Encoding unavoidable? [D]
TL;DR: As soon as I remove positional encoding (PE) from self- or cross-attention, I start seeing vertical hot lines in the attention heatmaps. Is there any way to get query-conditioned attention without PE? For context, I've been pre-training a couple of Transformer-based models (small, tinkering-level only): an encoder-decoder model and a cross-attention-memory-only model (basically removing the FFNs and using cross-attended vectors as memory banks instead). But every time I tr...
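For anyone who wants to poke at this, here's a minimal NumPy sketch of the diagnostic I'm describing (toy random Q/K, not my actual model): compute plain scaled dot-product attention with no PE, then measure how much mass a single key column collects averaged over all queries. A "vertical hot line" in the heatmap corresponds to that max column mean being far above the uniform baseline 1/T. The `sink_score` helper is just my own hypothetical name for the metric.

```python
import numpy as np

def attention_weights(Q, K):
    # scaled dot-product attention, no positional encoding anywhere
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

def sink_score(attn):
    # a "vertical hot line" = one key column collecting mass from nearly
    # every query; report the largest column mean (uniform baseline: 1/T)
    return attn.mean(axis=0).max()

rng = np.random.default_rng(0)
T, d = 16, 8
Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))

attn = attention_weights(Q, K)
print(f"max column mean: {sink_score(attn):.3f} "
      f"(uniform baseline {1.0 / T:.3f})")
```

With random Q/K the score stays near uniform; in a trained PE-free model you'd instead see it climb toward 1.0 on one column, which is exactly the sink pattern I keep hitting.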