Improving Text-to-Image Generation by Discriminator with Recaption Ability
- Clone this repository
git clone https://github.com/POSTECH-IMLAB/RecapD.git
- Create a conda environment and install all the dependencies
cd RecapD
conda create -n recapD python=3.6
conda activate recapD
pip install -r requirements.txt
- Download the preprocessed metadata for COCO and save it to datasets/
export PROJECT_DIR=~/RecapD # path for project dir
mkdir datasets
cd datasets
gdown https://drive.google.com/uc?id=1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9
unzip coco.zip
cd $PROJECT_DIR
- Download the COCO dataset and extract the images and annotations to datasets/coco (an example is sketched below)
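As a minimal sketch, assuming the standard COCO 2014 splits (consistent with the captions_train2014.json file used below), the images and annotations can be fetched from the official COCO site:
cd $PROJECT_DIR/datasets
mkdir -p coco && cd coco
# official COCO 2014 images and captions
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
unzip train2014.zip && unzip val2014.zip
unzip annotations_trainval2014.zip   # yields annotations/captions_train2014.json, ...
cd $PROJECT_DIR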
- Download the pre-trained DAMSM text encoder for COCO and save it to datasets/DAMSMencoders/coco
cd $PROJECT_DIR/datasets
mkdir DAMSMencoders
cd DAMSMencoders
gdown https://drive.google.com/uc?id=1zIrXCE9F6yfbEJIbNP5-YrEe2pZcPSGJ
unzip coco.zip
cd $PROJECT_DIR
- Build the vocabulary for the recaptioning model
python scripts/build_vocabularay.py \
--captions datasets/coco/annotations/captions_train2014.json \
--vocab-size 10000 \
--output-prefix datasets/vocab/coco14_10k \
--do-lower-case
- Train RecapD
python scripts/train_recapD.py
- Download the pretrained RecapD checkpoint and save it to exps/256_cond_cap/checkpoint.pth
export PROJECT_DIR=~/RecapD # path for project dir
mkdir -p exps/256_cond_cap
cd exps/256_cond_cap
gdown https://drive.google.com/uc?id=1or9fpMC6-cCVCGol39f_kOI1Vc0fZvBT
cd $PROJECT_DIR
Note that the words in the text should be in the vocabulary of the DAMSM text encoder.
- Generate an image from a sentence
python scripts/demo.py --text "a computer monitor next to a keyboard laptop and a mouse"
- Generate images from multiple sentences
Add the sentences for generation to example_sentences.txt (see the example below), then run:
python scripts/gen_recapD.py
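A hypothetical example_sentences.txt might look like the following; one COCO-style caption per line is an assumption, so check scripts/gen_recapD.py for the exact format it expects:
a computer monitor next to a keyboard laptop and a mouse
a group of people standing on a beach with surfboards
a plate of food with broccoli and rice on a table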
Generated samples