Image captioning is the process of generating a textual description of an image; it combines Natural Language Processing and Computer Vision to generate the captions. This work (Mohammad Mohammadifar's master's thesis) generates Persian captions for images.
Example 1 | Example 2 |
---|---|
![]() | ![]() |
کودکی با لباس آبی در حال بازی با توپ قرمز است ("a child in blue clothes is playing with a red ball") | دختربچه ای در حال بازی است ("a little girl is playing") |
Install the requirements with:

```shell
pip install -r requirements.txt
```
This work uses the Flickr8k dataset, which is available here. You can download and unzip it with the commands below:

```shell
wget https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip
unzip Flickr8k_Dataset.zip
```
This section consists of three pre-processing steps, as follows.
The first step extracts features from the Flickr8k images and saves them into a features.pkl file:

```shell
python 1_feature_exctract.py
```
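The exact network used by 1_feature_exctract.py is not documented here, so the sketch below only shows the overall shape of the step: run every image through a pretrained CNN and pickle a dictionary mapping image IDs to feature vectors. The `extract_feature` stub stands in for the real CNN call (Flickr8k pipelines commonly use VGG16 without its classification head, which yields 4096-dimensional vectors).

```python
import pickle
import numpy as np

def extract_feature(image_path):
    # Stand-in for a pretrained CNN forward pass (e.g. VGG16 minus the
    # classifier). Returns one feature vector per image.
    return np.random.rand(4096).astype("float32")

def build_features(image_paths):
    features = {}
    for path in image_paths:
        # Key each vector by the bare image ID: strip directory and extension.
        image_id = path.rsplit("/", 1)[-1].split(".")[0]
        features[image_id] = extract_feature(path)
    return features

features = build_features(["Flicker8k_Dataset/123.jpg", "Flicker8k_Dataset/456.jpg"])
with open("features.pkl", "wb") as f:
    pickle.dump(features, f)
```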
The second step prepares the captions from farsi_8k_human.txt and saves them into a descriptions.txt file:

```shell
python 1_text_prep.py
```
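The format of farsi_8k_human.txt is an assumption here (one "image_id TAB caption" pair per line, like the original Flickr8k token file), but caption preparation for this kind of pipeline typically means grouping captions by image ID and cleaning punctuation. A minimal sketch:

```python
import re

def clean_caption(caption):
    # \w is Unicode-aware in Python 3, so Persian letters are preserved.
    caption = re.sub(r"[^\w\s]", " ", caption)  # drop punctuation
    return " ".join(caption.split())            # collapse whitespace

def prepare(lines):
    # Assumed input format: "image_id.jpg<TAB>caption" per line.
    descriptions = {}
    for line in lines:
        image_id, caption = line.rstrip("\n").split("\t", 1)
        image_id = image_id.split(".")[0]
        descriptions.setdefault(image_id, []).append(clean_caption(caption))
    return descriptions
```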
The third step trains a tokenizer on the training-set image descriptions and saves it into tokenizer.pkl:

```shell
python 1_tokenizer.py
```
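The real 1_tokenizer.py presumably fits a Keras `Tokenizer` on the training captions and pickles it; this dependency-free stand-in shows the idea: assign every word seen in the training captions an integer index, with 0 reserved for padding.

```python
import pickle

def fit_tokenizer(captions):
    # Map each word to an integer index in order of first appearance.
    vocab = {}
    for caption in captions:
        for word in caption.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1  # 0 is reserved for padding
    return vocab

word_index = fit_tokenizer(["startseq یک زن endseq", "startseq یک مرد endseq"])
with open("tokenizer.pkl", "wb") as f:
    pickle.dump(word_index, f)
```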
In this section we train our Persian image captioning model based on features.pkl, descriptions.txt, and tokenizer.pkl. This process may take a while. At the end it produces checkpoint files named like model-ep*-loss*-val_loss*-attention-final.h5, any of which can be used in the evaluation section. Rename your preferred one to model.h5. Alternatively, you can download our pretrained model from here.

```shell
python 2_train_nic2.py
```
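The checkpoint names above follow the placeholder pattern that Keras' `ModelCheckpoint` fills in from training metrics; how 2_train_nic2.py configures its callbacks is not shown here, but the naming itself is plain Python string formatting:

```python
# The same template Keras would receive as the checkpoint filepath; epoch,
# loss, and val_loss are substituted at the end of each epoch.
pattern = "model-ep{epoch:02d}-loss{loss:.3f}-val_loss{val_loss:.3f}-attention-final.h5"
name = pattern.format(epoch=3, loss=2.512, val_loss=3.104)
# e.g. model-ep03-loss2.512-val_loss3.104-attention-final.h5
```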
In this section you can evaluate the trained model and then test it on any given image.
This part evaluates the model on the test data using the BLEU score:

```shell
python 3_eval.py
```
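3_eval.py presumably relies on a library implementation such as NLTK's `corpus_bleu`; to show what is being measured, here is a dependency-free sketch of BLEU-1 (clipped unigram precision) for a single caption pair:

```python
from collections import Counter

def bleu1(reference, candidate):
    # Fraction of candidate words that also appear in the reference,
    # with per-word counts clipped to the reference counts.
    ref_counts = Counter(reference.split())
    cand = candidate.split()
    matches = sum(min(ref_counts[w], c) for w, c in Counter(cand).items())
    return matches / len(cand)
```

Full BLEU also combines higher-order n-gram precisions and a brevity penalty, which the sketch omits.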
You can get the caption for your own images with the command below:

```shell
python test.py [path_to_image]
```
For example, for the picture below we should get:

```shell
$ python test.py test.jpg
startseq یک زن در حال عکس گرفتن از یک صخره بزرگ است endseq
```

("a woman is taking a picture of a large rock"; startseq and endseq are the special tokens marking the beginning and end of a generated caption)
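A script like test.py typically generates this output with a greedy decoding loop: start from "startseq", repeatedly feed the partial caption plus the image feature to the model, append the most likely next word, and stop at "endseq". The sketch below shows that loop with `predict_next` as a stub standing in for the trained model:

```python
def generate_caption(predict_next, feature, max_len=30):
    # Greedy decoding: extend the caption one word at a time until the
    # end token appears or the length limit is hit.
    words = ["startseq"]
    for _ in range(max_len):
        next_word = predict_next(feature, words)
        words.append(next_word)
        if next_word == "endseq":
            break
    return " ".join(words)
```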