
[2603.02952] Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT

arXiv - Machine Learning March 04, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.02952: Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT

Quantitative Biology > Genomics
arXiv:2603.02952 (q-bio)
[Submitted on 3 Mar 2026]

Title: Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT

Authors: Ihor Kendiukhov

Abstract:

Background: Single-cell foundation models such as Geneformer and scGPT encode rich biological information, but whether this includes causal regulatory logic rather than statistical co-expression remains unclear. Sparse autoencoders (SAEs) can resolve superposition in neural networks by decomposing dense activations into interpretable features, yet they have not been systematically applied to biological foundation models.

Results: We trained TopK SAEs on residual stream activations from all layers of Geneformer V2-316M (18 layers, d=1152) and scGPT whole-human (12 layers, d=512), producing atlases of 82525 and 24527 features, respectively. Both atlases confirm massive superposition, with 99.8 percent of features invisible to SVD. Systematic characterization reveals rich biological organization: 29 to 59 percent of features annotate to Gene Ontology, KEGG, Reactome, STRING, or TRRUST, with U-shaped layer profiles reflecting hierarchical abstraction. Fea...
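For readers unfamiliar with the term, a TopK SAE keeps only the k largest feature activations per input and reconstructs the input from those alone. The following is a minimal NumPy sketch of that forward pass, not the paper's implementation: the function name, the random weights, the toy dimensions, the value of k, and the convention of subtracting the decoder bias before encoding are all assumptions made for illustration. In the paper's setting the input width would be d=1152 (Geneformer) or d=512 (scGPT).

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """Forward pass of a TopK sparse autoencoder (illustrative sketch).

    x:     (batch, d_model) activations, e.g. residual stream vectors
    W_enc: (d_model, n_features), W_dec: (n_features, d_model)
    """
    # Pre-activation of every dictionary feature.
    # Subtracting b_dec first is a common convention, assumed here.
    pre = (x - b_dec) @ W_enc + b_enc
    # Column indices of the k largest pre-activations in each row.
    top_idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    # Keep those k values (passed through ReLU); every other feature is zero.
    codes = np.zeros_like(pre)
    rows = np.arange(pre.shape[0])[:, None]
    codes[rows, top_idx] = np.maximum(pre[rows, top_idx], 0.0)
    # Reconstruct the input from the sparse code.
    recon = codes @ W_dec + b_dec
    return codes, recon

# Toy sizes, chosen only so the example runs quickly (hypothetical values).
rng = np.random.default_rng(0)
batch, d_model, n_features, k = 8, 16, 64, 4
x = rng.normal(size=(batch, d_model))
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
W_dec = rng.normal(size=(n_features, d_model)) / np.sqrt(n_features)
b_enc = np.zeros(n_features)
b_dec = np.zeros(d_model)

codes, recon = topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k)
```

Each row of `codes` has at most k nonzero entries; in training, the weights would be fitted to minimize the reconstruction error between `recon` and `x`, and each column of the learned dictionary would then be a candidate interpretable feature.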

Originally published on March 04, 2026. Curated by AI News.

