4 changes: 2 additions & 2 deletions site/guides/matryoshka.md
@@ -1,7 +1,7 @@
# Matryoshka (Adaptive-Length) Embeddings

Matryoshka embeddings are a new class of embedding models introduced in the
-TODO-YYY paper [_TODO title_](https://arxiv.org/abs/2205.13147). They allow one
+26 May 2022 paper titled [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147). They allow one
to truncate excess dimensions in large vector, without sacrificing much quality.

Let's say your embedding model generate 1024-dimensional vectors. If you have 1
@@ -16,7 +16,7 @@ Matryoshka embeddings, on the other hand, _can_ be truncated, without losing muc
quality. Using [`mixedbread.ai`](#TODO) `mxbai-embed-large-v1` model, they claim
that

-They are called "Matryoshka" embeddings because ... TODO
+They are called "Matryoshka" embeddings after the "Matryoshka dolls", also known as "Russian nesting dolls", which are a set of wooden dolls of decreasing size that are placed inside one another. In a similar way, Matryoshka embedding can store more important information in earlier dimensions, and less important information in later dimensions. See more about Matryoshka embeddings at [Hugging Face](https://huggingface.co/blog/matryoshka)

## Matryoshka Embeddings with `sqlite-vec`
