[2603.02908] SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training
Computer Science > Artificial Intelligence
arXiv:2603.02908 (cs)
[Submitted on 3 Mar 2026]

Title: SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training
Authors: Qi Zhang, Yifei Wang, Xiaohan Wang, Jiajun Chai, Guojun Yin, Wei Lin, Yisen Wang

Abstract: In recent years, pre-trained large language models have achieved remarkable success across diverse tasks. Beyond the pivotal role of self-supervised pre-training, their effectiveness in downstream applications also depends critically on the post-training process, which adapts models to task-specific data and objectives. However, this process inevitably introduces model shifts that can influence performance in different domains, and how such shifts transfer remains poorly understood. To open up this black box, we propose the SAE-based Transferability Score (STS), a new metric that leverages sparse autoencoders (SAEs) to forecast post-training transferability. Taking supervised fine-tuning as an example, STS identifies shifted dimensions in SAE representations and calculates their correlations with downstream domains, enabling reliable estimation of transferability before fine-tuning. Extensive experiments across multiple models and domains show that STS accurately predicts...
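To make the abstract's description concrete, here is a minimal sketch of how an SAE-based transferability score along these lines could be computed. It assumes SAE feature activations are already extracted as matrices, that "shifted dimensions" are the features whose mean activation on the fine-tuning data deviates most from a base distribution, and that "correlation with downstream domains" is approximated by the relative activity of those features on target-domain data. The function name, the top-k selection, and the normalization are all hypothetical illustrations, not the paper's actual method.

import numpy as np

def sae_transferability_score(z_base, z_ft_data, z_target, top_k=100):
    """Hypothetical sketch of an SAE-based transferability score.

    z_base:    SAE feature activations on generic/base data, shape (N, D)
    z_ft_data: SAE feature activations on the fine-tuning data, shape (M, D)
    z_target:  SAE feature activations on downstream-domain data, shape (P, D)
    top_k:     number of most-shifted SAE dimensions to consider (assumed)
    """
    # 1. Identify "shifted" dimensions: SAE features whose mean activation
    #    on the fine-tuning data deviates most from the base distribution.
    shift = np.abs(z_ft_data.mean(axis=0) - z_base.mean(axis=0))
    shifted_dims = np.argsort(shift)[-top_k:]

    # 2. Relate the shifted dimensions to the downstream domain: here, the
    #    share of target-domain feature activity that falls on the shifted
    #    features (a stand-in for the paper's correlation computation).
    target_activity = z_target.mean(axis=0)
    return target_activity[shifted_dims].sum() / (target_activity.sum() + 1e-8)

# Example usage with random stand-in activations (D = 4096 SAE features):
rng = np.random.default_rng(0)
z_base = rng.random((1000, 4096))
z_ft = rng.random((800, 4096))
z_tgt = rng.random((500, 4096))
print(sae_transferability_score(z_base, z_ft, z_tgt))

The key property this sketch shares with the abstract's description is that everything is computed from the pre-trained model's SAE activations alone, so the score is available before any fine-tuning is run.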