🪆 Introduction to Matryoshka Embedding Models

Hugging Face Blog February 15, 2026 11 min read


Published February 23, 2024

Tom Aarsen (tomaarsen), Joshua (Xenova), Omar Sanseviero (osanseviero)

In this blogpost, we will introduce you to the concept of Matryoshka Embeddings and explain why they are useful. We will discuss how these models are theoretically trained and how you can train them using Sentence Transformers.

Additionally, we will provide practical guidance on how to use Matryoshka Embedding models and share a comparison between a Matryoshka embedding model and a regular embedding model. Finally, we invite you to check out our interactive demo that showcases the power of these models.

Table of Contents

Understanding Embeddings
🪆 Matryoshka Embeddings
🪆 Matryoshka Dolls
Why would you use 🪆 Matryoshka Embedding models?
How are 🪆 Matryoshka Embedding models trained?
    Theoretically
    In Sentence Transformers
How do I use 🪆 Matryoshka Embedding models?
    Theoretically
    In Sentence Transformers
Results
Demo
References

Understanding Embeddings

Embeddings are one of the most versatile tools in natural language processing, enabling practitioners to solve a large variety of tasks. In essence, an embedding is a numerical representation of a more complex object, like text, images, audio, etc.

The embedding model will always produce embeddings of the same fixed size. You can then compute the similarity of complex objects by computing the similarity of the respective em...
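To make the idea of comparing fixed-size embeddings concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the similarity measure, which is a common choice but not one the excerpt above names. The three 4-dimensional vectors are hypothetical toy values, not the output of any real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        # Embeddings from the same model always share one fixed size,
        # so a mismatch means the inputs are not comparable.
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (real models produce far more dimensions).
cat = [0.9, 0.1, 0.0, 0.3]
kitten = [0.8, 0.2, 0.1, 0.4]
car = [0.0, 0.9, 0.7, 0.1]

sim_close = cosine_similarity(cat, kitten)  # related objects
sim_far = cosine_similarity(cat, car)       # unrelated objects

print(f"cat vs kitten: {sim_close:.3f}")
print(f"cat vs car:    {sim_far:.3f}")
```

With these toy values the related pair scores much higher than the unrelated pair, which is the property that tasks such as search, clustering, and recommendation rely on.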

Originally published on February 15, 2026. Curated by AI News.
