[2602.15028] Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization

arXiv - AI

Summary

The paper examines how increasing context length in large language models (LLMs) affects personalization quality and privacy risks, revealing a scaling gap in performance.

Why It Matters

As LLMs are increasingly used in sensitive applications, understanding the trade-offs between context length, personalization, and privacy is crucial. This research provides a benchmark for evaluating these aspects, which can guide future model development and deployment in privacy-critical scenarios.

Key Takeaways

  • Longer context lengths in LLMs lead to decreased personalization quality and increased privacy risks.
  • The study introduces PAPerBench, a benchmark for evaluating the impact of context length on LLMs.
  • Theoretical analysis attributes the degradation to attention dilution, an inherent limitation of soft attention in fixed-capacity Transformers.
  • Empirical findings indicate a general scaling gap in LLM performance as context length increases.
  • The research supports reproducible evaluation and future studies on privacy and personalization in AI.

Computer Science > Machine Learning · arXiv:2602.15028 (cs) · Submitted on 16 Feb 2026

Title: Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization
Author: Shangding Gu

Abstract: Large language models (LLMs) are increasingly deployed in privacy-critical and personalization-oriented scenarios, yet the role of context length in shaping privacy leakage and personalization effectiveness remains largely unexplored. We introduce a large-scale benchmark, PAPerBench, to systematically study how increasing context length influences both personalization quality and privacy protection in LLMs. The benchmark comprises approximately 29,000 instances with context lengths ranging from 1K to 256K tokens, yielding a total of 377K evaluation questions. It jointly evaluates personalization performance and privacy risks across diverse scenarios, enabling controlled analysis of long-context model behavior. Extensive evaluations across state-of-the-art LLMs reveal consistent performance degradation in both personalization and privacy as context length increases. We further provide a theoretical analysis of attention dilution under context scaling, explaining this behavior as an inherent limitation of soft attention in fixed-capacity Transformers. The empirical and theoretical findings together suggest a general scaling gap…
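The attention-dilution argument can be illustrated with a toy calculation. This sketch is not from the paper: it assumes a single "relevant" token with a fixed logit competing against normally distributed distractor logits, and shows how the softmax weight assigned to that token shrinks as the context grows, roughly like 1/n.

```python
import math
import random

def relevant_token_weight(n_tokens: int, signal_logit: float = 4.0) -> float:
    """Softmax attention weight on one high-scoring 'relevant' token
    when the remaining n_tokens - 1 logits are drawn from N(0, 1)."""
    random.seed(0)  # fixed seed so the illustration is reproducible
    noise = [random.gauss(0.0, 1.0) for _ in range(n_tokens - 1)]
    logits = [signal_logit] + noise
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[0] / sum(exps)

# Weight on the relevant token collapses as context length scales up,
# mirroring the 1K -> 256K range covered by the benchmark.
for n in (1_000, 16_000, 256_000):
    w = relevant_token_weight(n)
    print(f"context {n:>7} tokens: weight on relevant token = {w:.5f}")
```

Because softmax mass must sum to one over all positions, a fixed-magnitude signal logit inevitably loses share to the growing pool of distractors; this is the "fixed-capacity" limitation the abstract refers to, independent of model quality.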

Related Articles

  • [2603.16105] Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization
  • [2603.09643] MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings
  • [2603.07339] Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice
  • [2602.00185] QUASAR: A Universal Autonomous System for Atomistic Simulation and a Benchmark of Its Capabilities