image_captioning_flickr

In this project, we worked with both the flickr_8k and flickr_30k datasets, but we ran into storage and runtime complications with flickr_30k.

We used an encoder-decoder model to build our image caption generator, with a CNN as the encoder and an LSTM as the decoder.
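To illustrate the CNN encoder and LSTM decoder layout, here is a minimal sketch in PyTorch. It is not the code from this project: the framework, the class names `EncoderCNN` and `DecoderLSTM`, the layer sizes, and the choice to feed the image feature as the first LSTM input step are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Hypothetical small CNN that maps an image to a fixed-size feature vector."""

    def __init__(self, embed_size: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (B, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.conv(images).flatten(1)  # (B, 32)
        return self.fc(feats)  # (B, embed_size)


class DecoderLSTM(nn.Module):
    """Hypothetical LSTM decoder that scores the next word at every step."""

    def __init__(self, embed_size: int, hidden_size: int, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        tokens = self.embed(captions)  # (B, T, embed_size)
        # Assumption: the image feature is prepended as the first input step.
        inputs = torch.cat([features.unsqueeze(1), tokens], dim=1)  # (B, T+1, embed_size)
        hidden, _ = self.lstm(inputs)  # (B, T+1, hidden_size)
        return self.fc(hidden)  # (B, T+1, vocab_size)


if __name__ == "__main__":
    encoder = EncoderCNN(embed_size=64)
    decoder = DecoderLSTM(embed_size=64, hidden_size=128, vocab_size=1000)
    images = torch.randn(2, 3, 64, 64)          # batch of 2 random RGB images
    captions = torch.randint(0, 1000, (2, 5))   # batch of 2 captions, 5 token ids each
    logits = decoder(encoder(images), captions)
    print(logits.shape)  # torch.Size([2, 6, 1000])
```

In practice the encoder is often a pretrained image classifier with its final layer removed, and the logits would be trained with a cross-entropy loss against the caption tokens shifted by one position. Neither choice is confirmed by this README.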

Datasets can be found here:

flickr_8k: https://www.kaggle.com/datasets/waelboussbat/flickr8ksau

flickr_30k: https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset

More details can be found in the report and/or presentation.