[2602.05735] CSRv2: Unlocking Ultra-Sparse Embeddings

[2602.05735] CSRv2: Unlocking Ultra-Sparse Embeddings

arXiv - Machine Learning 4 min read

About this article

Abstract page for arXiv paper 2602.05735: CSRv2: Unlocking Ultra-Sparse Embeddings

Computer Science > Machine Learning arXiv:2602.05735 (cs) [Submitted on 5 Feb 2026 (v1), last revised 2 Mar 2026 (this version, v4)] Title:CSRv2: Unlocking Ultra-Sparse Embeddings Authors:Lixuan Guo, Yifei Wang, Tiansheng Wen, Yifan Wang, Aosong Feng, Bo Chen, Stefanie Jegelka, Chenyu You View a PDF of the paper titled CSRv2: Unlocking Ultra-Sparse Embeddings, by Lixuan Guo and 7 other authors View PDF HTML (experimental) Abstract:In the era of large foundation models, the quality of embeddings has become a central determinant of downstream task performance and overall system capability. Yet widely used dense embeddings are often extremely high-dimensional, incurring substantial costs in storage, memory, and inference latency. To address these, Contrastive Sparse Representation (CSR) is recently proposed as a promising direction, mapping dense embeddings into high-dimensional but k-sparse vectors, in contrast to compact dense embeddings such as Matryoshka Representation Learning (MRL). Despite its promise, CSR suffers severe degradation in the ultra-sparse regime, where over 80% of neurons remain inactive, leaving much of its efficiency potential unrealized. In this paper, we introduce CSRv2, a principled training approach designed to make ultra-sparse embeddings viable. CSRv2 stabilizes sparsity learning through progressive k-annealing, enhances representational quality via supervised contrastive objectives, and ensures end-to-end adaptability with full backbone finetunin...

Originally published on March 03, 2026. Curated by AI News.

Related Articles

Claude Code leak exposes a Tamagotchi-style ‘pet’ and an always-on agent | The Verge
Llms

Claude Code leak exposes a Tamagotchi-style ‘pet’ and an always-on agent | The Verge

Anthropic says “human error” resulted in a leak that exposed Claude Code’s source code. The leaked code, which has since been copied to G...

The Verge - AI · 4 min ·
You can now use ChatGPT with Apple’s CarPlay | The Verge
Llms

You can now use ChatGPT with Apple’s CarPlay | The Verge

ChatGPT is now accessible from your CarPlay dashboard if you have iOS 26.4 or newer and the latest version of the ChatGPT app.

The Verge - AI · 3 min ·
Llms

Have Companies Began Adopting Claude Co-Work at an Enterprise Level?

Hi Guys, My company is considering purchasing the Claude Enterprise plan. The main two constraints are: - Being able to block usage of Cl...

Reddit - Artificial Intelligence · 1 min ·
Llms

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime