[2601.05724] Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding
Computer Science > Artificial Intelligence
arXiv:2601.05724 (cs)
[Submitted on 9 Jan 2026 (v1), last revised 2 Mar 2026 (this version, v2)]

Title: Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding

Authors: Yuxuan Zhou, Fei Huang, Heng Li, Fengyi Wu, Tianyu Wang, Jianwei Zhang, Junyang Lin, Zhi-Qi Cheng

Abstract: Verification is a key bottleneck in improving inference speed while maintaining distribution fidelity in Speculative Decoding. Recent work has shown that sequence-level verification leads to a higher number of accepted tokens compared to token-wise verification. However, existing solutions often rely on surrogate approximations or are constrained by partial information, struggling with joint intractability. In this work, we propose Hierarchical Speculative Decoding (HSD), a provably lossless verification method that significantly boosts the expected number of accepted tokens and overcomes joint intractability by balancing excess and deficient probability mass across accessible branches. Our extensive large-scale experiments demonstrate that HSD yields consistent improvements in acceptance rates across diverse model families and benchmarks. Moreover, its strong explainability and generality make it readily integrable into a wide range of speculative decoding framewor...
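
For context on the token-wise verification that the abstract uses as its point of comparison, here is a minimal sketch of the standard lossless accept/reject rule used in speculative sampling. It is not the HSD method, which the abstract does not specify in detail. The function name `verify_token` and the toy distributions are hypothetical, and the sketch assumes both models expose full next-token probabilities over a shared vocabulary.

```python
import random

def verify_token(draft_token, p_target, q_draft, rng):
    """Token-wise verification of a single drafted token (baseline, not HSD).

    p_target and q_draft are probability lists over the same vocabulary.
    Returns (token, accepted).
    """
    p = p_target[draft_token]
    q = q_draft[draft_token]  # q > 0 because the token was sampled from q_draft
    # Accept the drafted token with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return draft_token, True
    # On rejection, resample from the residual: the mass where the target
    # exceeds the draft. rng.choices normalizes the weights internally.
    residual = [max(0.0, pt - qd) for pt, qd in zip(p_target, q_draft)]
    token = rng.choices(range(len(residual)), weights=residual)[0]
    return token, False

# Toy usage with made-up distributions over a 3-token vocabulary.
rng = random.Random(0)
p = [0.6, 0.3, 0.1]  # target model (assumed values)
q = [0.2, 0.5, 0.3]  # draft model (assumed values)
drafted = rng.choices(range(3), weights=q)[0]
print(verify_token(drafted, p, q, rng))
```

Under this rule the emitted token follows the target distribution exactly, which is the sense in which verification is called lossless. The rejection branch redistributes the draft's excess mass onto tokens where it falls short of the target, one position at a time.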