[2601.04854] Projected Autoregression: Autoregressive Language Generation in Continuous State Space
Computer Science > Computation and Language

arXiv:2601.04854 (cs) [Submitted on 8 Jan 2026 (v1), last revised 6 Apr 2026 (this version, v3)]

Title: Projected Autoregression: Autoregressive Language Generation in Continuous State Space
Authors: Oshri Naparstek

Abstract: Standard autoregressive language models generate text by repeatedly selecting a discrete next token, coupling prediction with irreversible commitment at every step. We show that token selection is not the only viable autoregressive interface. Projected Autoregression replaces token selection with continuous prediction in embedding space, followed by discrete projection at commitment time. The model predicts next-token vectors via regression and contrastive objectives, while discrete tokens arise only by nearest-neighbor projection. An optional mutable suffix (a "liquid tail") enables iterative refinement before commitment, but the central change is more basic: next-step prediction is continuous, and discrete tokens are produced only as a downstream interface. Projected Autoregression establishes a concrete alternative to token-selection autoregression: language generation can be organized around continuous-state prediction with delayed discrete commitment. Refinement remains local to a short causal suffix within a left-to-right causal proc...
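The abstract's core mechanism, mapping a continuously predicted next-token vector onto a discrete token by nearest-neighbor projection against the vocabulary embedding table, can be illustrated with a minimal sketch. This is not the paper's implementation; the function name, cosine-similarity metric, and toy vocabulary are illustrative assumptions.

```python
import numpy as np

def project_to_token(pred_vec, embedding_table):
    """Nearest-neighbor projection (illustrative sketch): map a
    continuous predicted vector to the index of the vocabulary
    embedding it is closest to, by cosine similarity."""
    # L2-normalize the vocabulary embeddings and the predicted vector
    emb = embedding_table / np.linalg.norm(embedding_table, axis=1, keepdims=True)
    q = pred_vec / np.linalg.norm(pred_vec)
    # the discrete token is the argmax over similarities
    return int(np.argmax(emb @ q))

# toy vocabulary of 4 random 8-dimensional token embeddings
rng = np.random.default_rng(0)
table = rng.normal(size=(4, 8))

# a "predicted" continuous vector lying near token 2's embedding
pred = table[2] + 0.05 * rng.normal(size=8)
print(project_to_token(pred, table))  # → 2
```

The point of the interface is that the model's autoregressive state stays continuous; the argmax here is only the downstream commitment step, so it can be deferred (e.g. over the paper's "liquid tail" suffix) without changing the prediction machinery.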