Block Sparse Matrices for Smaller and Faster Language Models
Published September 10, 2020
François Lagunas (madlag)

Saving space and time, one zero at a time

In previous blog posts we introduced sparse matrices and what they can do to improve neural networks. The basic assumption is that full dense layers are often overkill and can be pruned without a significant loss in precision. In some cases sparse linear layers can even improve precision and/or generalization.

The main issue is that currently available code supporting sparse algebra computation is severely lacking in efficiency, and we are still waiting for official PyTorch support. That's why we ran out of patience and took some time this summer to address this "lacuna". Today, we are excited to release the extension pytorch_block_sparse.

By itself, or even better combined with other methods like distillation and quantization, this library enables networks which are both smaller and faster, something Hugging Face considers crucial to let anybody use neural networks in production at low cost, and to improve the experience for the end user.

Usage

The provided BlockSparseLinear module is a drop-in replacement for torch.nn.Linear, and it is trivial to use it in your models:

```python
# from torch.nn import Linear
from pytorch_block_sparse import BlockSparseLinear

...

# self.fc = nn.Linear(1024, 256)
self.fc = BlockSparseLinear(1024, 256, density=0.1)
```

The extension also provides a ...
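To make the drop-in pattern above concrete, here is a minimal, self-contained sketch of a small classifier head that swaps one dense layer for a block-sparse one. The layer sizes, the density value, and the SparseClassifier module name are illustrative assumptions, not part of the library; only the BlockSparseLinear(in_features, out_features, density=...) call shown above comes from it.

```python
# Minimal sketch (illustrative sizes and names): one dense layer is
# replaced by a block-sparse layer that keeps only ~10% of the weight blocks.
import torch
import torch.nn as nn
from pytorch_block_sparse import BlockSparseLinear

class SparseClassifier(nn.Module):
    def __init__(self, hidden_size=1024, num_labels=2, density=0.1):
        super().__init__()
        # Drop-in replacement for nn.Linear(hidden_size, 256)
        self.fc = BlockSparseLinear(hidden_size, 256, density=density)
        self.out = nn.Linear(256, num_labels)

    def forward(self, x):
        return self.out(torch.relu(self.fc(x)))

# The block-sparse kernels run on the GPU, so the model and inputs go to CUDA.
model = SparseClassifier().cuda()
logits = model(torch.randn(8, 1024).cuda())
print(logits.shape)  # torch.Size([8, 2])
```

Training such a model is unchanged: the sparse layer exposes its parameters like any other nn.Module, so the usual optimizer and loss setup applies.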