An implementation of the paper "GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation". This contains an evaluation metric for text-to-3D generative models.
- We released 110 image prompts corresponding to the text prompts. Each image is carefully selected to align with the text. We further remove the backgrounds using rembg or Clipdrop. Download the gallery at this link.
The main dependencies of this codebase are the OpenAI library and PyTorch. For PyTorch installation, please refer to the official website, as the right command depends heavily on your environment. The following commands install the other packages:
# Install the OpenAI library
pip install --upgrade openai
# Other packages
pip install --upgrade tqdm numpy Pillow gdown
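After installing, it can help to confirm that every dependency is importable before spending API budget. The snippet below is a minimal sketch, not part of this codebase: the function name `find_missing_modules` is hypothetical, and the module list simply mirrors the packages named above (Pillow is imported as `PIL`, PyTorch as `torch`).

```python
import importlib.util

# Import names for the packages listed above (assumption: default import names).
REQUIRED_MODULES = ["openai", "torch", "tqdm", "numpy", "PIL", "gdown"]


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```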
For a detailed explanation of the data format, please refer to this doc.
# TEST DATA
# 13 methods; 110 prompts; 120 uniform RGB and normal map renderings for each.
# Google Drive: https://drive.google.com/file/d/1pYmSRu_oMy_v6f7ngnkFER6PNWmJAe52/view?usp=sharing
cd data/tournament-v0
gdown "https://drive.google.com/uc?id=1pYmSRu_oMy_v6f7ngnkFER6PNWmJAe52"
unzip methods
Please find the prompts.json file under the tournament folder (e.g. data/tournament-v0/prompts.json).
For each prompt listed inside, use your text-to-3D generative model to create one or more shapes per prompt.
For each of these shapes, please render 120 evenly spaced views using the camera angle chosen by the Threestudio codebase.
For each render, please aim for 512x512 resolution.
For each RGB render, please also create the corresponding surface normal rendering.
These renders will be provided to GPT-4V.
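As a rough illustration of "120 evenly spaced views", the sketch below computes azimuth angles for a full turn around the object. It assumes the views differ only in azimuth and that the sweep covers [0, 360) degrees without repeating the first view; elevation, camera distance, and field of view should come from your Threestudio configuration and are not modeled here. The function name is hypothetical.

```python
def evenly_spaced_azimuths(num_views=120):
    """Return `num_views` azimuth angles in degrees, evenly spaced over [0, 360)."""
    step = 360.0 / num_views
    return [i * step for i in range(num_views)]


if __name__ == "__main__":
    angles = evenly_spaced_azimuths()
    print(len(angles), angles[:3], angles[-1])
```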
Finally, organize the rendered images into the following folder structure:
- data/<your_method_name>/
# Prompt from zero
- <prompt-id-1>/
-<seed1>
rgb_001.png
...
rgb_119.png
normal_001.png
...
normal_119.png
...
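Before running the evaluation, you may want to check that every seed folder holds the expected number of renders. The following is a minimal sketch under the layout shown above (`<method>/<prompt-id>/<seed>/` with `rgb_*.png` and `normal_*.png` files); `check_method_folder` is a hypothetical helper, not a function provided by this codebase.

```python
from pathlib import Path


def check_method_folder(method_dir, views_per_seed=120):
    """Return a list of problems found under `method_dir`.

    Assumes the layout <method_dir>/<prompt-id>/<seed>/{rgb,normal}_*.png.
    """
    problems = []
    for prompt_dir in sorted(p for p in Path(method_dir).iterdir() if p.is_dir()):
        seed_dirs = sorted(s for s in prompt_dir.iterdir() if s.is_dir())
        if not seed_dirs:
            problems.append(f"{prompt_dir}: no seed folders")
        for seed_dir in seed_dirs:
            # Count RGB and normal renders separately; both should match.
            for kind in ("rgb", "normal"):
                count = len(list(seed_dir.glob(f"{kind}_*.png")))
                if count != views_per_seed:
                    problems.append(
                        f"{seed_dir}: expected {views_per_seed} {kind} images, found {count}"
                    )
    return problems
```

For example, `check_method_folder("data/<your_method_name>")` returns an empty list when every seed has 120 RGB and 120 normal renders.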
Once your data is in a format the evaluation can parse, run the following command to obtain an ELO score that places your method within the existing tournament.
# --eval new_method: evaluate a new method
# -t: folder to tournament data
# -m: folder to method
# -o: (optional) output directory
python gpt_eval_alpha.py \
--apikey <your_openai_api_key> \
--eval new_method \
-t data/t23d-tournament-v0 \
-m data/<your_method_name> \
-o results/<your_method_name>
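For readers unfamiliar with ELO scores, the sketch below shows the textbook Elo model for a single pairwise comparison. It is only meant to convey what a rating difference means; it is an assumption-laden illustration and not necessarily how `gpt_eval_alpha.py` computes its scores. The function names and the `k=32` step size are hypothetical choices.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Update both ratings after one comparison.

    `score_a` is 1.0 if A wins, 0.0 if B wins, and 0.5 for a tie.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```

Under this model, two methods with equal ratings are each expected to win half of their comparisons, and a 400-point gap corresponds to roughly a 10-to-1 preference.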
To run a tournament over a set of text-to-3D generative models, please organize their renders in the following structure.
<root>
config.json
prompts.json
methods/
<method-name-1>
<prompt-id-1>
<seed-1>
rgb_0.png ...
normal_0.png ...
...
<seed-k>
...
<prompt-id-m>
...
<method-name-n>
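To see what a tournament folder actually contains, the structure above can be walked as follows. This is a minimal sketch assuming only the `methods/<method>/<prompt-id>/<seed>/` nesting shown; it does not read `config.json` or `prompts.json`, and `list_tournament_entries` is a hypothetical helper.

```python
from pathlib import Path


def list_tournament_entries(root):
    """Return sorted (method, prompt_id, seed) triples found under <root>/methods."""
    entries = []
    methods_dir = Path(root) / "methods"
    # Each seed folder sits exactly three levels below methods/.
    for seed_dir in methods_dir.glob("*/*/*"):
        if seed_dir.is_dir():
            method, prompt_id, seed = seed_dir.relative_to(methods_dir).parts
            entries.append((method, prompt_id, seed))
    return sorted(entries)
```

Counting the returned triples per method is a quick way to spot a method that is missing prompts or seeds.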
For more information about what should be put into config.json and prompts.json, please see this link.
# --eval tournament: evaluate a tournament
# -t: folder to tournament data
# -b: budget (number of requests)
# -o: (optional) output directory
python gpt_eval_alpha.py \
--apikey <your_openai_api_key> \
--eval tournament \
-t <path-to-tournament-data> \
-b 200 \
-o results/<tournament-name>
- More visualization and utilities tools!
- Text-to-3D Leaderboard
If you find our codebase useful for your research, please cite:
@inproceedings{wu2023gpteval3d,
author = {Tong Wu and Guandao Yang and Zhibing Li and Kai Zhang and
Ziwei Liu and Leonidas Guibas and Dahua Lin and Gordon Wetzstein},
title = {GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation},
booktitle = {CVPR},
year = {2024},
}
We sincerely thank the following projects for providing their excellent codebases: GPT-4V, threestudio, mvdream, prolificdreamer, fantasia3d, point-e, shap-e, dreamgaussian, wonder3d, and syncdreamer!