[2603.27027] TAPS: Task Aware Proposal Distributions for Speculative Sampling
Computer Science > Computation and Language
arXiv:2603.27027 (cs)
[Submitted on 27 Mar 2026]

Title: TAPS: Task Aware Proposal Distributions for Speculative Sampling
Authors: Mohamad Zbib, Mohamad Bazzi, Ammar Mohanna, Hasan Abed Al Kader Hammoud, Bernard Ghanem

Abstract: Speculative decoding accelerates autoregressive generation by letting a lightweight draft model propose future tokens that a larger target model then verifies in parallel. In practice, however, draft models are usually trained on broad generic corpora, so it remains unclear how much speculative decoding quality depends on the draft training distribution. We study this question with lightweight HASS and EAGLE-2 drafters trained on MathInstruct, ShareGPT, and mixed-data variants, evaluated on MT-Bench, GSM8K, MATH-500, and SVAMP. Measured by acceptance length, task-specific training yields clear specialization: MathInstruct-trained drafts are strongest on reasoning benchmarks, while ShareGPT-trained drafts are strongest on MT-Bench. Mixed-data training improves robustness, but larger mixtures do not dominate across decoding temperatures. We also study how to combine specialized drafters at inference time. Naive checkpoint averaging performs poorly, whereas confidence-based routing improves over single-domain drafts and merged-tree verification yields the highest accepta...
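The acceptance-length metric used in the abstract comes from the standard speculative sampling verification rule: each drafted token is accepted with probability min(1, p_target/p_draft), and generation falls back to the target model at the first rejection. A minimal sketch of that verification loop, with hypothetical function and variable names (the paper's actual implementation is not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)

def verify_draft(draft_probs, target_probs, draft_tokens):
    """Standard speculative sampling verification (sketch).

    Accept each drafted token t with probability
    min(1, p_target(t) / p_draft(t)); stop at the first rejection.
    Returns the number of accepted tokens (the "acceptance length"
    contribution of this draft block).
    """
    accepted = 0
    for tok, q, p in zip(draft_tokens, draft_probs, target_probs):
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted += 1
        else:
            break  # first rejection ends the accepted prefix
    return accepted

# Toy usage: if draft and target distributions agree exactly,
# every proposed token is accepted.
V = 4
q = [np.full(V, 1.0 / V) for _ in range(3)]  # draft distributions
p = [np.full(V, 1.0 / V) for _ in range(3)]  # target distributions
print(verify_draft(q, p, [0, 1, 2]))  # -> 3
```

A task-specialized drafter, as studied in the paper, raises acceptance length on in-domain inputs precisely by making its proposal distribution q closer to the target's p on those inputs.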