[2504.16956] GeneMamba: An Efficient and Effective Foundation Model on Single Cell Data
Computer Science > Computation and Language
arXiv:2504.16956 (cs)
[Submitted on 22 Apr 2025 (v1), last revised 23 Mar 2026 (this version, v4)]
Title: GeneMamba: An Efficient and Effective Foundation Model on Single Cell Data
Authors: Cong Qi, Hanzhang Fang, Siqi Jiang, Xun Song, Tianxing Hu, Wei Zhi
Abstract: Single-cell RNA sequencing (scRNA-seq) enables high-resolution analysis of cellular heterogeneity, but its complexity, marked by high dimensionality, sparsity, and batch effects, poses major computational challenges. Transformer-based models have made significant advances in this domain but are often limited by their quadratic complexity and suboptimal handling of long-range dependencies. In this work, we introduce GeneMamba, a scalable and efficient foundation model for single-cell transcriptomics built on state space modeling. Leveraging the Bi-Mamba architecture, GeneMamba captures bidirectional gene context with linear-time complexity, offering substantial computational gains over transformer baselines. The model is pretrained on nearly 30 million cells and incorporates biologically informed objectives, including pathway-aware contrastive loss and rank-based gene encoding. We evaluate GeneMamba across diverse tasks, including multi-batch integration, cell type annotation, and gene-gene co...
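
The abstract mentions rank-based gene encoding without giving details. As a rough illustration only, the sketch below shows one common way such an encoding can work: a cell's genes are ordered by descending expression and the ordered gene identifiers become the token sequence. The function name `rank_encode`, the `max_len` parameter, the zero-expression filter, and the toy data are all assumptions made for this example; nothing here is taken from the GeneMamba implementation.

```python
def rank_encode(expression, gene_ids, max_len=None):
    """Order gene ids by descending expression (hypothetical helper).

    Genes with zero expression are dropped, which is an assumption of this
    sketch rather than a detail stated in the abstract.
    """
    expressed = [(value, gene) for value, gene in zip(expression, gene_ids) if value > 0]
    # list.sort is stable, so genes with equal expression keep their input order.
    expressed.sort(key=lambda pair: -pair[0])
    tokens = [gene for _, gene in expressed]
    # Optionally truncate to a fixed context length.
    return tokens if max_len is None else tokens[:max_len]


# Toy cell with six genes (made-up values, for illustration only).
cell = [0.0, 5.0, 0.0, 2.0, 9.0, 2.0]
genes = ["G0", "G1", "G2", "G3", "G4", "G5"]
tokens = rank_encode(cell, genes)
print(tokens)  # ['G4', 'G1', 'G3', 'G5']
```

An encoding of this kind depends only on the relative order of expression values within a cell, so it is unaffected by any per-cell rescaling that preserves that order.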