The MTEB Leaderboard is available here. To submit to it:
- Add meta information about your model to the models directory. See the docstring of `ModelMeta` for details on the metadata fields.
```python
from mteb.model_meta import ModelMeta

bge_m3 = ModelMeta(
    name="model_name",
    languages=["model_languages"],  # in format eng-Latn
    open_weights=True,
    revision="5617a9f61b028005a4858fdac845db406aefb181",
    release_date="2024-06-28",
    n_parameters=568_000_000,
    memory_usage_mb=2167,
    embed_dim=4096,
    license="mit",
    max_tokens=8194,
    reference="https://huggingface.co/BAAI/bge-m3",
    similarity_fn_name="cosine",
    framework=["Sentence Transformers", "PyTorch"],
    use_instructions=False,
    public_training_code=None,
    public_training_data="https://huggingface.co/datasets/cfli/bge-full-data",
    training_datasets={"your_dataset": ["train"]},
)
```
To calculate `memory_usage_mb` you can run `model_meta.memory_usage_mb()`.
By default, the model will run using the `sentence_transformers_loader` loader function. If you need to use a custom implementation, you can specify the `loader` parameter in the `ModelMeta` class. For example:
```python
import numpy as np

from mteb.encoder_interface import PromptType
from mteb.models.wrapper import Wrapper


class CustomWrapper(Wrapper):
    def __init__(self, model_name, model_revision):
        super().__init__(model_name, model_revision)
        # your custom implementation here

    def encode(
        self,
        sentences: list[str],
        *,
        task_name: str,
        prompt_type: PromptType | None = None,
        **kwargs,
    ) -> np.ndarray:
        # your custom implementation here
        return np.zeros((len(sentences), self.embed_dim))
```
Then you can specify the `loader` parameter in the `ModelMeta` class:
```python
from functools import partial

your_model = ModelMeta(
    loader=partial(
        CustomWrapper,
        model_name="model_name",
        model_revision="5617a9f61b028005a4858fdac845db406aefb181",
    ),
    ...
)
```
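As a quick sanity check, you can call the wrapper directly before wiring it into `ModelMeta`. A minimal sketch, assuming `"STS12"` as an example task name and that your implementation sets `embed_dim` (the stub's `encode` above needs it to build the output):
```python
# Minimal smoke test for the wrapper above. Assumes your implementation
# defines self.embed_dim; otherwise the stub's encode() will fail.
my_model = CustomWrapper("model_name", "5617a9f61b028005a4858fdac845db406aefb181")
embeddings = my_model.encode(["a sample sentence"], task_name="STS12")
print(embeddings.shape)  # (1, embed_dim)
```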
- Run the desired model on MTEB:
Either use the Python API:
```python
import mteb

# load a model from the hub (or for a custom implementation see https://github.com/embeddings-benchmark/mteb/blob/main/docs/reproducible_workflow.md)
model = mteb.get_model("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

tasks = mteb.get_tasks(...)  # get specific tasks
# or
tasks = mteb.get_benchmark("MTEB(eng, classic)")  # or use a specific benchmark

evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results")
```
Or using the command line interface:
```bash
mteb run -m {model_name} -t {task_names}
```
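For instance, using the model from the Python example above (`Banking77Classification` is just an illustrative task name):
```bash
mteb run -m sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 -t Banking77Classification
```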
These will save the results in a folder called `results/{model_name}/{model_revision}`.
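To sanity-check what was written, a minimal sketch using only the standard library (it simply walks the `results` folder produced by the run above):
```python
from pathlib import Path

# List every result file under the results folder, whatever the nesting.
for result_file in Path("results").rglob("*.json"):
    print(result_file)
```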
- Push Results to the Leaderboard
To add results to the public leaderboard, push your results to the results repository via a PR. Once merged, they will appear on the leaderboard after a day.
- Wait for the leaderboard to refresh
Notes:
If your model uses Sentence Transformers and requires different prompts for encoding the queries and corpus, you can take advantage of the `prompts` parameter.
Internally, `mteb` uses `query` as the prompt name for encoding the queries and `passage` as the prompt name for encoding the corpus. This is aligned with the default prompt names used by Sentence Transformers.
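For illustration, a minimal sketch of how these prompt names map onto the Sentence Transformers API (the model name mirrors the `ModelMeta` example below; the prompt strings are shown only as an example):
```python
from sentence_transformers import SentenceTransformer

# Register the two prompt names mteb expects ("query" and "passage").
model = SentenceTransformer(
    "intfloat/multilingual-e5-small",
    prompts={"query": "query: ", "passage": "passage: "},
)
# At encoding time, the prompt is selected by name.
embeddings = model.encode(["how do prompts work?"], prompt_name="query")
```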
You can directly add the prompts when saving and uploading your model to the Hub. For an example, refer to this configuration file. These prompts can then be specified in the `ModelMeta` object:
```python
from functools import partial

model = ModelMeta(
    loader=partial(  # type: ignore
        sentence_transformers_loader,
        model_name="intfloat/multilingual-e5-small",
        revision="fd1525a9fd15316a2d503bf26ab031a61d056e98",
        model_prompts={
            "query": "query: ",
            "passage": "passage: ",
        },
    ),
)
```
If you are unable to directly add the prompts in the model configuration, you can instantiate the model using the `sentence_transformers_loader` and pass `prompts` as an argument. For more details, see the `mteb/models/bge_models.py` file.
Models that use instructions can use the `InstructSentenceTransformerWrapper`. For example:
```python
from functools import partial

model = ModelMeta(
    loader=partial(
        InstructSentenceTransformerWrapper,
        model="nvidia/NV-Embed-v1",
        revision="7604d305b621f14095a1aa23d351674c2859553a",
        instruction_template="Instruct: {instruction}\nQuery: ",
    ),
    ...
)
```
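To see what the template expands to, here is a small illustration (the instruction text is hypothetical; real instructions come from the task metadata):
```python
# Purely illustrative: format the template with an example instruction.
template = "Instruct: {instruction}\nQuery: "
print(template.format(instruction="Given a question, retrieve passages that answer it"))
# Instruct: Given a question, retrieve passages that answer it
# Query:
```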