[2602.15843] The Perplexity Paradox: Why Code Compresses Better Than Math in LLM Prompts

arXiv - AI · 3 min read

Summary

This article explores the 'perplexity paradox' in large language models (LLMs): code prompts tolerate aggressive compression while math prompts do not, and the paper introduces a new adaptive compression algorithm, TAAC.

Why It Matters

Understanding how LLMs process and compress different types of prompts is crucial for optimizing their performance. This research provides insights that could enhance code generation and reasoning tasks, making AI applications more efficient and effective.

Key Takeaways

  • Code prompts tolerate higher compression rates than math prompts.
  • A new adaptive compression algorithm (TAAC) improves efficiency while preserving quality.
  • The study validates its findings across six code benchmarks and four reasoning benchmarks.
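The paradox in the first takeaway follows from how perplexity-based prompt compressors (in the style of LLMLingua and similar tools) work: tokens the model finds predictable (low perplexity) are assumed redundant and pruned. A minimal toy sketch below uses a unigram frequency model as a stand-in for an LLM's per-token perplexity; all function names and the scoring scheme are illustrative, not the paper's implementation:

```python
import math
from collections import Counter

def surprisal_scores(tokens, corpus_tokens):
    """Toy stand-in for model perplexity: rarer tokens score higher.

    Laplace smoothing ensures unseen tokens get the highest surprisal.
    """
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    vocab = len(counts)
    return [-math.log((counts[t] + 1) / (total + vocab + 1)) for t in tokens]

def compress(tokens, corpus_tokens, keep_ratio=0.6):
    """Keep only the highest-surprisal tokens, preserving original order."""
    scores = surprisal_scores(tokens, corpus_tokens)
    k = max(1, int(len(tokens) * keep_ratio))
    keep = set(sorted(range(len(tokens)), key=lambda i: -scores[i])[:k])
    return [t for i, t in enumerate(tokens) if i in keep]
```

Under this scheme, unusual code syntax tokens score high and survive, while digits in a math word problem, which are highly predictable in context, score low and get dropped even though the answer depends on them.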

Computer Science > Computation and Language
arXiv:2602.15843 (cs) [Submitted on 21 Jan 2026]
Title: The Perplexity Paradox: Why Code Compresses Better Than Math in LLM Prompts
Authors: Warren Johnson

Abstract: In "Compress or Route?" (Johnson, 2026), we found that code generation tolerates aggressive prompt compression (r >= 0.6) while chain-of-thought reasoning degrades gradually. That study was limited to HumanEval (164 problems), left the "perplexity paradox" mechanism unvalidated, and provided no adaptive algorithm. This paper addresses all three gaps. First, we validate across six code benchmarks (HumanEval, MBPP, HumanEval+, MultiPL-E) and four reasoning benchmarks (GSM8K, MATH, ARC-Challenge, MMLU-STEM), confirming that the compression threshold generalizes across languages and difficulties. Second, we conduct the first per-token perplexity analysis (n=723 tokens), revealing a "perplexity paradox": code syntax tokens are preserved (high perplexity) while numerical values in math problems are pruned despite being task-critical (low perplexity). Signature injection recovers +34 percentage points in pass rate (5.3% to 39.3%; Cohen's h=0.890). Third, we propose TAAC (Task-Aware Adaptive Compression), achieving 22% cost reduction with 96% quality preservation, outperforming fixed-ratio compression by 7%. MBPP validation (n=1,8...
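The abstract describes two remedies: a task-aware compression ratio (TAAC) and "signature injection" to restore pruned task-critical values. A hedged sketch of how such a policy might be wired together is shown below; the ratio table and the regex-based value recovery are illustrative assumptions, not the paper's actual TAAC algorithm:

```python
import re

# Illustrative per-task compression ratios (fraction of tokens kept), loosely
# inspired by the finding that code tolerates r >= 0.6 while reasoning
# degrades. These values are assumptions, not the paper's learned policy.
TASK_RATIOS = {"code": 0.6, "math": 0.9, "reasoning": 0.85}

def adaptive_compress(prompt, task_type, compress_fn):
    """Compress with a task-dependent ratio, then re-inject numbers for math.

    compress_fn(prompt, ratio) is any compressor returning a shortened string.
    """
    ratio = TASK_RATIOS.get(task_type, 1.0)
    compressed = compress_fn(prompt, ratio)
    if task_type == "math":
        # "Signature injection" (sketch): append task-critical numeric values
        # that the perplexity-based compressor may have pruned.
        numbers = re.findall(r"\d+(?:\.\d+)?", prompt)
        missing = [n for n in numbers if n not in compressed]
        if missing:
            compressed += "\nValues: " + ", ".join(missing)
    return compressed
```

The design point is that the compressor itself stays generic; only the ratio and the post-hoc value recovery are conditioned on the detected task type.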
