[MIEB] Complete all model meta based on main schema #1800

Closed
Tracked by #1791
isaac-chung opened this issue Jan 14, 2025 · 2 comments
Labels
mieb The image extension of MTEB

isaac-chung commented Jan 14, 2025

[update] Please go over each model file from MIEB and create one PR per file to fill in the model meta.

  1. Start from the model's Hugging Face page.
  2. Fill in as many fields as you can based on that page.
  3. If the page does not provide a piece of information (e.g. no license is listed), leave the field as None.
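
As a rough illustration of the steps above, the sketch below fills in a small metadata record and lists the fields that are still `None`. `ModelMetaSketch` and `unfilled_fields` are hypothetical names used only for this example, and the model name and values are placeholders; the real `ModelMeta` class in the repository has more fields and may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class ModelMetaSketch:
    """Hypothetical stand-in for the real ModelMeta; only a few fields are shown."""

    name: str
    n_parameters: int | None = None
    embed_dim: int | None = None
    license: str | None = None
    open_weights: bool | None = None
    reference: str | None = None


def unfilled_fields(meta: ModelMetaSketch) -> list[str]:
    """Return the names of fields that are still None, in declaration order."""
    return [f.name for f in fields(meta) if getattr(meta, f.name) is None]


# Placeholder values, not taken from any real model page.
meta = ModelMetaSketch(
    name="example-org/example-model",
    n_parameters=1_000_000,
    open_weights=True,
    reference="https://huggingface.co/example-org/example-model",
    # embed_dim and license are not stated on the page, so they stay None.
)

print(unfilled_fields(meta))
```

A check like this can help confirm, before opening a PR, which fields were left as `None` on purpose.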

===========================================================
Old description:

Here are the fields that are present in v2.0.0 but either absent from mieb or rarely filled out there:

    n_parameters: int | None = None # exists but not often filled out.
    max_tokens: float | None = None # exists but not often filled out.
    embed_dim: int | None = None # exists but not often filled out.
    license: str | None = None
    open_weights: bool | None = None      # exists as `open_source`
    public_training_data: bool | None = None # exists but not often filled out.
    public_training_code: bool | None = None # exists but not often filled out.
    framework: list[FRAMEWORKS] = []    # exists but not often filled out.
    reference: STR_URL | None = None # exists but not often filled out.
    similarity_fn_name: DISTANCE_METRICS | None = None
    use_instructions: bool | None = None
    training_datasets: dict[str, list[str]] | None = None
    adapted_from: str | None = None
    superseded_by: str | None = None
    citation: str | None = None

Here are the missing fields that are present in mieb but absent in v2.0.0:

    modalities: list[MODALITIES] = ["text"]
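
The two lists above amount to a set difference over field names. A minimal sketch, assuming the field sets implied by the inline comments (fields marked "exists" are treated as present in mieb, and `open_source` is treated as the mieb name for `open_weights`); the sets are transcribed from this issue, not read from the actual code:

```python
# Field names transcribed from the lists in this issue (an assumption, not
# introspected from the real ModelMeta classes).
v2_fields = {
    "n_parameters", "max_tokens", "embed_dim", "license", "open_weights",
    "public_training_data", "public_training_code", "framework", "reference",
    "similarity_fn_name", "use_instructions", "training_datasets",
    "adapted_from", "superseded_by", "citation",
}
mieb_fields = {
    "n_parameters", "max_tokens", "embed_dim", "open_source",
    "public_training_data", "public_training_code", "framework", "reference",
    "modalities",
}

# Map the mieb name onto its v2.0.0 equivalent before comparing.
renamed = {"open_source": "open_weights"}
mieb_normalized = {renamed.get(name, name) for name in mieb_fields}

missing_in_mieb = sorted(v2_fields - mieb_normalized)
missing_in_v2 = sorted(mieb_normalized - v2_fields)

print("absent in mieb:", missing_in_mieb)
print("absent in v2.0.0:", missing_in_v2)
```

Fields that exist in both schemas but are rarely filled out do not show up in either difference; those still need to be completed per model file.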

isaac-chung commented

There are also changes to the ModelMeta class in main such as #1794.

isaac-chung changed the title from "[MIEB] Complete all model meta based on v2.0.0 schema" to "[MIEB] Complete all model meta based on main schema" on Feb 1, 2025