[2603.03806] Separators in Enhancing Autoregressive Pretraining for Vision Mamba
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.03806 (cs)
[Submitted on 4 Mar 2026]

Title: Separators in Enhancing Autoregressive Pretraining for Vision Mamba
Authors: Hanpeng Liu, Zidan Wang, Shuoxi Zhang, Kaiyuan Gao, Kun He

Abstract: The state space model Mamba has recently emerged as a promising paradigm in computer vision, attracting significant attention for its efficient handling of long-sequence tasks. Mamba's inherently causal mechanism makes it particularly well suited to autoregressive pretraining. However, current autoregressive pretraining methods are restricted to short sequences and fail to exploit Mamba's strength on extended ones. To address this limitation, we introduce an autoregressive pretraining method for Vision Mamba that substantially extends the input sequence length: SeparaTors for AutoRegressive pretraining (STAR), which demarcate and distinguish different images. Specifically, we insert identical separators before each image to mark its beginning. This strategy lets us quadruple the input sequence length of Vision Mamba while preserving the original dimensions of the dataset images. Employing this long-sequence pretraining technique, our ST...
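The separator mechanism the abstract describes is simple enough to sketch. Below is a minimal, hypothetical PyTorch illustration, assuming a single learnable separator embedding that is prepended to each image's patch tokens before several images are concatenated into one long causal sequence. All names here (STARSequenceBuilder, embed_dim, tokens_per_image) and the choice of a learnable embedding are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class STARSequenceBuilder(nn.Module):
    """Prepend one shared separator token to each image's patch tokens,
    then concatenate several images into a single long sequence."""

    def __init__(self, embed_dim: int = 192):
        super().__init__()
        # A single embedding, reused before every image ("identical
        # separators" in the abstract). Making it learnable is an
        # assumption; the paper may use a fixed token instead.
        self.separator = nn.Parameter(torch.zeros(1, 1, embed_dim))
        nn.init.trunc_normal_(self.separator, std=0.02)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, num_images, tokens_per_image, embed_dim)
        b, k, n, d = patch_tokens.shape
        sep = self.separator.expand(b, k, 1, d)       # same separator for all images
        seq = torch.cat([sep, patch_tokens], dim=2)   # [SEP, image tokens] per image
        return seq.reshape(b, k * (n + 1), d)         # one long causal sequence

# With 4 images of 196 patch tokens each, the sequence grows to
# 4 * (196 + 1) = 788 tokens, roughly quadrupling the single-image
# length while each image keeps its original resolution.
builder = STARSequenceBuilder(embed_dim=192)
tokens = torch.randn(2, 4, 196, 192)   # batch of 2, 4 images per sample
print(builder(tokens).shape)           # torch.Size([2, 788, 192])
```

Because every image is preceded by the same token, the causal model can learn that a separator marks a boundary where the next image begins, without any change to the images themselves.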