[2601.21708] FBS: Modeling Native Parallel Reading inside a Transformer
Computer Science > Artificial Intelligence

arXiv:2601.21708 (cs) [Submitted on 29 Jan 2026 (v1), last revised 8 Apr 2026 (this version, v2)]

Title: FBS: Modeling Native Parallel Reading inside a Transformer
Authors: Tongxi Wang

Abstract: Large language models (LLMs) excel across many tasks, yet inference is still dominated by strictly token-by-token autoregression. Existing acceleration methods largely patch this pipeline and miss core ingredients of human reading: content-adaptive foresight, chunk-structure-aware compute allocation, and train-test consistency for preview/skimming. We propose the Fovea-Block-Skip Transformer (FBS), which injects a causal, trainable loop into Transformers via a Parafovea-Attention Window (PAW), a Chunk-Head (CH), and a Skip-Gate (SG). Across diverse benchmarks, FBS improves the quality-efficiency trade-off without increasing parameters, and ablations show the three modules are complementary.

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as: arXiv:2601.21708 [cs.AI] (or arXiv:2601.21708v2 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2601.21708

Submission history
From: Tongxi Wang
[v1] Thu, 29 Jan 2026 13:39:55 UTC (28,828 KB)
[v2] Wed, 8 Apr 2026 10:39:08 UTC (28,831 KB)
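The abstract names the three modules but this page gives no implementation details. Purely as an illustration of the kind of mechanism a Skip-Gate might be, the sketch below shows a learned, chunk-level soft gate that blends a Transformer sub-block's output with an identity path; every name here (ChunkSkipGate, chunk_size, the pooling and gating choices) is a hypothetical assumption, not taken from the paper.

```python
import torch
import torch.nn as nn

class ChunkSkipGate(nn.Module):
    """Hypothetical sketch (not the paper's method): score fixed-size
    chunks of the sequence and softly skip a sub-block for chunks the
    gate deems unimportant."""

    def __init__(self, d_model: int, chunk_size: int = 16):
        super().__init__()
        self.chunk_size = chunk_size
        self.score = nn.Linear(d_model, 1)  # per-chunk relevance score

    def forward(self, x: torch.Tensor, block: nn.Module) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len assumed divisible by chunk_size.
        b, t, d = x.shape
        chunks = x.view(b, t // self.chunk_size, self.chunk_size, d)
        # Mean-pool each chunk, then map to a sigmoid gate in [0, 1].
        gate = torch.sigmoid(self.score(chunks.mean(dim=2)))   # (b, n_chunks, 1)
        gate = gate.repeat_interleave(self.chunk_size, dim=1)  # (b, t, 1)
        # Soft skip: low-gate chunks pass through nearly unchanged. Note the
        # block still runs on all tokens here; a hard variant would route
        # only high-gate chunks through `block` to actually save compute.
        return gate * block(x) + (1.0 - gate) * x

# Usage with a plain feed-forward sub-block as the gated unit.
d_model = 64
ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                    nn.Linear(4 * d_model, d_model))
sg = ChunkSkipGate(d_model, chunk_size=16)
x = torch.randn(2, 128, d_model)
print(sg(x, ffn).shape)  # torch.Size([2, 128, 64])
```

A soft gate like this keeps training fully differentiable; realizing inference savings would require a hard (e.g. thresholded) routing variant, which the abstract's "quality-efficiency trade-off" claim suggests but does not specify on this page.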