[2512.15987] Provably Extracting the Features from a General Superposition

[2512.15987] Provably Extracting the Features from a General Superposition

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2512.15987: Provably Extracting the Features from a General Superposition

Computer Science > Machine Learning arXiv:2512.15987 (cs) [Submitted on 17 Dec 2025 (v1), last revised 31 Mar 2026 (this version, v2)] Title:Provably Extracting the Features from a General Superposition Authors:Allen Liu View a PDF of the paper titled Provably Extracting the Features from a General Superposition, by Allen Liu View PDF HTML (experimental) Abstract:It is widely believed that complex machine learning models generally encode features through linear representations. This is the foundational hypothesis behind a vast body of work on interpretability. A key challenge toward extracting interpretable features, however, is that they exist in superposition. In this work, we study the question of extracting features in superposition from a learning theoretic perspective. We start with the following fundamental setting: we are given query access to a function \[ f(x)=\sum_{i=1}^n \sigma_i(v_i^\top x), \] where each unit vector $v_i$ encodes a feature direction and $\sigma_i:\R\to\R$ is an arbitrary response function and our goal is to recover the $v_i$ and the function $f$. In learning-theoretic terms, superposition refers to the \emph{overcomplete regime}, when the number of features is larger than the underlying dimension (i.e. $n > d$), which has proven especially challenging for typical algorithmic approaches. Our main result is an efficient query algorithm that, from noisy oracle access to $f$, identifies all feature directions whose responses are non-degenerate an...

Originally published on April 01, 2026. Curated by AI News.

Related Articles

Machine Learning

ICML 2026 - Heavy score variance among various batches? [D]

I've seen some people say in their batch very few papers have above 3.5 score, but then other reviewers say that most papers in their sco...

Reddit - Machine Learning · 1 min ·
Machine Learning

We’re proud to open-source LIDARLearn [R] [D] [P]

It’s a unified PyTorch library for 3D point cloud deep learning. To our knowledge, it’s the first framework that supports such a large co...

Reddit - Machine Learning · 1 min ·
Llms

I built a repo for implementing and training LLM architectures from scratch in minimal PyTorch — contributions welcome! [P]

Hey everyone, I've been working on a repo where I implement large language model architectures using the simplest PyTorch code possible. ...

Reddit - Machine Learning · 1 min ·
Llms

I built a repo for implementing and training LLM architectures from scratch in minimal PyTorch — contributions welcome! [P]

Hey everyone, I've been working on a repo where I implement large language model architectures using the simplest PyTorch code possible. ...

Reddit - Machine Learning · 1 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime