Conversation

@JCruan519
This PR integrates MME-SCI into VLMEvalKit

About MME-SCI

MME-SCI is a comprehensive and challenging multimodal scientific benchmark of 1,019 manually curated question-answer pairs. It covers four subjects (mathematics, physics, chemistry, biology), five languages (Chinese, English, French, Spanish, Japanese), and three input modalities (text-only, image-only, and image-text hybrid), with 63 fine-grained knowledge points. The benchmark is designed to assess the scientific reasoning capabilities of multimodal large language models and to reveal their weaknesses effectively.

Changes

vlmeval/dataset/mmesci.py - MMESCI dataset implementation with automatic HuggingFace download
vlmeval/dataset/__init__.py - Register the MMESCI dataset classes
vlmeval/inference.py - Introduce a force_use_dataset_prompt parameter that enforces use of the dataset-side build_prompt
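The flag described above could be wired into the inference loop roughly as follows. This is a minimal sketch, assuming a model exposing `use_custom_prompt`/`build_prompt` and a dataset exposing `build_prompt`; the function name `build_messages` and its exact signature are illustrative, not the actual VLMEvalKit code.

```python
def build_messages(model, dataset, line, force_use_dataset_prompt=False):
    """Choose between the model's custom prompt and the dataset-side prompt.

    Illustrative sketch: when force_use_dataset_prompt is set, the
    dataset's build_prompt always wins; otherwise the model may opt in
    to its own template for this dataset via use_custom_prompt.
    """
    if not force_use_dataset_prompt and model.use_custom_prompt(dataset.dataset_name):
        # Model has opted in to its own prompt template for this dataset
        return model.build_prompt(line, dataset=dataset.dataset_name)
    # Otherwise (or when forced), fall back to the dataset-side prompt
    return dataset.build_prompt(line)
```

With the flag set, a dataset such as MMESCI that relies on its own multilingual prompt construction would bypass any model-side template.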

Supported Datasets

  • MMESCI_VisionOnly
  • MMESCI_ZH
  • MMESCI_EN
  • MMESCI_FR
  • MMESCI_ES
  • MMESCI_JA

Citation

@article{ruan2025mme,
  title={{MME-SCI}: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models},
  author={Ruan, Jiacheng and Jiang, Dan and Gao, Xian and Liu, Ting and Fu, Yuzhuo and Kang, Yangyang},
  journal={arXiv preprint arXiv:2508.13938},
  year={2025}
}

@mzr1996
Collaborator

mzr1996 commented Nov 28, 2025

Do we really need force_use_dataset_prompt?
Since the dataset_name is passed into use_custom_prompt, it should be the model's responsibility to choose the appropriate prompt for the specified dataset, rather than using a flag to restrict the model.
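The alternative suggested here could look roughly like this: the model declines its custom prompt for MMESCI variants inside its own use_custom_prompt, so no global flag is needed. A minimal sketch with an illustrative class name; the method signature follows the convention described in the comment, not necessarily the exact VLMEvalKit interface.

```python
class ExampleModel:
    """Illustrative model that defers to the dataset-side prompt for MMESCI."""

    def use_custom_prompt(self, dataset_name: str) -> bool:
        # Decline the model-side template for all MMESCI variants,
        # so the dataset's build_prompt is used instead
        if dataset_name.startswith('MMESCI'):
            return False
        # Keep the custom template for other datasets
        return True
```

Under this design, the dispatch logic in inference.py stays unchanged and each model stays in control of its own prompting policy.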
