
Benchmark Granularity and Model Robustness for Image-Text Retrieval: A Reproducibility Study

This repository contains the official code for the SIGIR 2025 paper:

Benchmark Granularity and Model Robustness for Image-Text Retrieval: A Reproducibility Study

🎓 Accepted at the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2025)

We provide code, configuration files, and resources to reproduce all experiments, results, and analyses from the paper.


📁 Project Structure


├── README.md          <- The top-level README for developers using this project.
│
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes the directory a Python module
│   │
│   ├── evaluation.py  <- Entry point for the CLI
│   │
│   ├── models         <- Package that loads models to make predictions
│   └── ...            <- ...
│
├── config             <- Config files that associate models with the benchmarks to evaluate.
│
└── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.,
                          generated with `pip freeze > requirements.txt`

Setting Up the Dev Environment

  1. Create a clean virtual environment for this project:

    make create-venv
    
  2. Activate the virtual environment:

    source .venv/bin/activate
    
  3. Install the required Python packages:

    pip install -r requirements.txt
    

Running the experiments

Model evaluation example:

python3 src/evaluation.py --dataset f30k --model clip --task t2i --perturbation none
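The flags above are the full CLI surface shown in this README. Purely as an illustration (the actual argument handling lives in src/evaluation.py and may differ), the entry point can be thought of as an argparse program along these lines:

# Illustrative sketch of the CLI documented above; hypothetical,
# not the actual contents of src/evaluation.py.
import argparse

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Image-text retrieval evaluation")
    parser.add_argument("--dataset", choices=["f30k", "coco"], required=True,
                        help="Benchmark to evaluate on")
    parser.add_argument("--model", choices=["clip", "blip"], required=True,
                        help="Retrieval model to load")
    parser.add_argument("--task", choices=["t2i", "i2t"], required=True,
                        help="Text-to-image (t2i) or image-to-text (i2t) retrieval")
    parser.add_argument("--perturbation", default="none",
                        help="Query perturbation setting (e.g., none, jaccard)")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(f"Evaluating {args.model} on {args.dataset} ({args.task}), "
          f"perturbation={args.perturbation}")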

Further examples, such as printing results and computing Jaccard similarity, use the same entry point; a perturbation-specific invocation is shown in the Results and Robustness Analysis section below.

Results and Robustness Analysis

All experimental outputs, including accuracy metrics, perturbation robustness comparisons, and plots, can be generated using the scripts in:

  • src/
  • notebooks/
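For orientation, accuracy in image-text retrieval is conventionally reported as Recall@K, the fraction of queries whose ground-truth item appears in the top K retrieved results. The helper below is a self-contained sketch of that metric, not this repository's implementation:

# Sketch of Recall@K, the standard accuracy metric in image-text
# retrieval. Hypothetical helper; not the repository's code.

def recall_at_k(rankings: list[list[str]], ground_truth: list[str], k: int) -> float:
    """Fraction of queries whose ground-truth item is in the top-k ranking."""
    hits = sum(gt in ranked[:k] for ranked, gt in zip(rankings, ground_truth))
    return hits / len(ground_truth)

# Example: 2 of 3 queries rank their ground-truth item in the top 2.
rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
ground_truth = ["b", "f", "g"]
print(recall_at_k(rankings, ground_truth, k=2))  # 2/3 ≈ 0.67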

You can also use:

python3 src/evaluation.py --dataset coco --model blip --task i2t --perturbation jaccard

to run model evaluation under specific robustness settings.
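For reference, Jaccard similarity between two top-k result lists (e.g., for a clean query and its perturbed variant) can be computed as follows; this is a sketch of the metric itself, not the repository's implementation:

# Sketch: Jaccard similarity between two top-k retrieval result sets,
# e.g. for a clean query vs. its perturbed variant. Illustrates the
# metric only; the repository's own computation may differ.

def jaccard_similarity(clean_topk: list[str], perturbed_topk: list[str]) -> float:
    """|A ∩ B| / |A ∪ B| over the retrieved item IDs."""
    a, b = set(clean_topk), set(perturbed_topk)
    if not a and not b:
        return 1.0  # two empty result lists are trivially identical
    return len(a & b) / len(a | b)

# Example: 3 of 5 top-ranked items survive the perturbation.
clean = ["img_01", "img_02", "img_03", "img_04", "img_05"]
perturbed = ["img_02", "img_03", "img_05", "img_09", "img_11"]
print(jaccard_similarity(clean, perturbed))  # 3/7 ≈ 0.43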

Citation

If you use this codebase or find our study useful in your research, please cite our paper:

@inproceedings{hendriksen2025granularity,
  title     = {Benchmark Granularity and Model Robustness for Image-Text Retrieval: A Reproducibility Study},
  author    = {Mariya Hendriksen and Shuo Zhang and Ridho Reinanda and Mohamed Yahya and Edgar Meij and Maarten de Rijke},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year      = {2025}
}

Contact

If you have any questions or feedback, please reach out via [[email protected]] or [[email protected]].
