[2602.17691] Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering
Summary
This paper introduces HELIX, a geometric framework that decouples output entropy from hallucination in quantized language models, preserving semantic coherence while enabling diverse, high-temperature generation.
Why It Matters
As language models become increasingly prevalent in AI applications, addressing issues of hallucination and output coherence is critical. This research offers a novel approach to improve the reliability of generated content, which can enhance user trust and application effectiveness in various fields, including natural language processing and AI-driven content generation.
Key Takeaways
- HELIX framework decouples output entropy from hallucination in LLMs.
- High-temperature outputs can maintain semantic coherence through graduated steering.
- The Unified Truth Score (UTS) effectively measures output quality.
- Steering only a small percentage of layers can correct trajectory divergence.
- Multi-Temperature Synthesis can generate significantly more unique concepts.
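The Unified Truth Score mentioned above combines token-level semantic entropy with Mahalanobis distance from a pre-computed truthfulness manifold. A minimal sketch of that combination is below; the weighting `alpha` and the linear combination rule are assumptions for illustration, since the paper's exact formula is not reproduced in this summary.

```python
import numpy as np

def semantic_entropy(probs):
    """Shannon entropy (in nats) of a token probability distribution."""
    p = probs[probs > 0]
    return float(-np.sum(p * np.log(p)))

def mahalanobis_distance(h, mu, cov_inv):
    """Distance of hidden state h from the manifold centroid mu,
    scaled by the inverse covariance of the truthful-state distribution."""
    d = h - mu
    return float(np.sqrt(d @ cov_inv @ d))

def unified_truth_score(probs, h, mu, cov_inv, alpha=0.5):
    """Hypothetical UTS: a weighted sum of entropy and manifold distance.
    Higher scores would indicate likely trajectory divergence."""
    return (alpha * semantic_entropy(probs)
            + (1 - alpha) * mahalanobis_distance(h, mu, cov_inv))
```

In this sketch, a hidden state lying at the manifold centroid with a low-entropy token distribution yields a low score, while states far from the manifold push the score up regardless of entropy.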
Computer Science > Machine Learning
arXiv:2602.17691 (cs) [Submitted on 6 Feb 2026]
Title: Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering
Authors: Craig Atkinson
Abstract: Quantized language models face a fundamental dilemma: low sampling temperatures yield repetitive, mode-collapsed outputs, while high temperatures (T > 2.0) cause trajectory divergence and semantic incoherence. We present HELIX, a geometric framework that decouples output entropy from hallucination by tethering hidden-state trajectories to a pre-computed truthfulness manifold. HELIX computes a Unified Truth Score (UTS) combining token-level semantic entropy with Mahalanobis distance from the manifold. When UTS indicates trajectory divergence, graduated steering vectors redirect activations toward structurally coherent regions while affecting only 0.2-2.5% of tokens. On 4-bit quantized Granite 4.0 H Small (32B/9B active, hybrid Mamba-Transformer): GSM8K maintains 88.84% accuracy at T = 3.0 (2.81pp degradation from T = 0.5); MMLU maintains 72.49% across 14,042 questions (1.24pp degradation). This demonstrates that high-temperature hallucination is primarily trajectory divergence rather than semantic collapse. Notably, steering the sparse Transformer attention layers (~10% of layers) is ...
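The abstract describes graduated steering: when the score signals divergence, activations are redirected toward coherent regions, with only a small fraction of tokens affected. A hedged sketch of one plausible realization is below; the threshold, the strength schedule, and blending toward a manifold centroid `mu` are all assumptions, not the paper's stated mechanism.

```python
import numpy as np

def graduated_steering(h, mu, uts, threshold=1.0, max_strength=0.5):
    """Hypothetical graduated steering of a hidden state.

    If the Unified Truth Score is at or below the threshold, the
    activation is left untouched (most tokens take this path, matching
    the paper's claim that only 0.2-2.5% of tokens are affected).
    Otherwise, blend toward the manifold centroid mu with a strength
    that grows with the amount of divergence, capped at max_strength.
    """
    if uts <= threshold:
        return h  # on-manifold: no intervention
    strength = min(max_strength, (uts - threshold) / uts)
    return (1.0 - strength) * h + strength * mu
```

The graduated (rather than all-or-nothing) correction is what lets high-temperature sampling keep its diversity: mildly divergent states receive only a gentle nudge back toward the manifold.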