This project is a TensorFlow eager implementation of the paper "Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles". The target task is based on the Food 101 classification competition.
First of all, create a virtual environment and install all the requirements listed in `requirements.txt`. We suggest using TensorFlow GPU; a TensorFlow version v such that 1.14 <= v < 2.0 is mandatory.
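The version constraint above can be checked before launching anything. The following is a minimal sketch, not part of the project: the helper name `tf_version_supported` is hypothetical, and it only compares the major and minor components of the version string.

```python
def tf_version_supported(version: str) -> bool:
    """Return True when 1.14 <= version < 2.0.

    Only the major and minor components are compared, so suffixes such as
    "1.15.0rc1" do not matter. Hypothetical helper, not part of the project.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (1, 14) <= (major, minor) < (2, 0)


# Typical use inside the virtual environment (requires TensorFlow):
#   import tensorflow as tf
#   if not tf_version_supported(tf.__version__):
#       raise RuntimeError("TensorFlow 1.14 <= v < 2.0 is required")
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexicographically, where "1.9" would sort after "1.14".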
The following files have to be configured in order to run the Jigsaw puzzle solver task:
- `relazione/paper.pdf`: contains the theory and practical description of this project (ENG);
- `Dataset/create_dataset_h5.py`: creates the dataset that the Jigsaw puzzle solver will use for the pretext task. The project works with any kind of images, but it has been tested with the ILSVRC2017 CLS-LOC dataset and 500K images. The number of desired images and the resource folder path can be configured in the very first lines of this `.py` file. The script will create `n_el` images under `Dataset/resources/images`, divided into `train`, `val` and `test` sets, and a `.h5` file in the same directory containing the mean, std and dimension of the dataset;
- `config.py`: is the configuration file for the Jigsaw Puzzle pretext task. Before starting the execution, some parameters must be changed:
  - `hammingSetSize`: defines the number of permutations to be used. We set it to 40, as the task would otherwise be too difficult;
  - `data_path`: the folder of the dataset. The root of this folder must contain the `train`, `val` and `test` images together with their `.h5` description file. If you have generated the dataset with an unchanged `create_dataset_h5.py`, you don't have to change this parameter;
  - the other parameters are all tunable, although this implementation has not been tested with different Jigsaw parameters.
- `Dataset/generate_hamming_set.py`: generates the hamming set that is needed for the training. Just run it to create the `max_hamming_x.h5` file;
- run the `main.py` file. It accepts two `--mode` values, `jps` and `ft`, which select the training to run.
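To illustrate what a hamming set is, here is a minimal pure-Python sketch that picks a fixed number of tile permutations that are far apart in Hamming distance. It is an assumption-laden illustration, not the code of `Dataset/generate_hamming_set.py`: the function names, the greedy max-min heuristic, the random candidate pool and the 9-tile default are all choices made for this example, and nothing is written to an `.h5` file.

```python
import math
import random


def hamming(p, q):
    """Number of positions at which two permutations differ."""
    return sum(a != b for a, b in zip(p, q))


def build_hamming_set(set_size=40, n_tiles=9, pool_size=2000, seed=0):
    """Greedily select `set_size` permutations that are mutually distant.

    Assumption: candidates come from a random pool rather than from all
    n_tiles! permutations, which keeps this sketch fast in pure Python.
    """
    pool_size = min(pool_size, math.factorial(n_tiles))
    if set_size > pool_size:
        raise ValueError("set_size cannot exceed the number of candidates")

    rng = random.Random(seed)
    candidates = set()
    while len(candidates) < pool_size:
        candidates.add(tuple(rng.sample(range(n_tiles), n_tiles)))
    pool = sorted(candidates)  # fixed order keeps the result reproducible

    chosen = [pool.pop(rng.randrange(len(pool)))]
    # min_dist[i] is the distance from pool[i] to its nearest chosen permutation
    min_dist = [hamming(p, chosen[0]) for p in pool]

    while len(chosen) < set_size:
        # take the candidate whose nearest chosen permutation is farthest away
        best = max(range(len(pool)), key=min_dist.__getitem__)
        chosen.append(pool.pop(best))
        min_dist.pop(best)
        min_dist = [min(d, hamming(p, chosen[-1])) for d, p in zip(min_dist, pool)]
    return chosen


if __name__ == "__main__":
    perms = build_hamming_set(set_size=40)
    print(len(perms), "permutations, first one:", perms[0])
```

The `set_size` argument plays the role of `hammingSetSize` in `config.py`: a larger set means more classes for the pretext task to distinguish.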
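The `--mode` switch of `main.py` could be parsed along the following lines. This is a hedged sketch rather than the project's actual argument handling: the `parse_args` helper, the `required=True` setting and the help text are assumptions, and the reading of `jps` as the Jigsaw puzzle solver pretext training and `ft` as the fine-tuning on the target task is inferred from the names only.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; hypothetical stand-in for main.py's parser."""
    parser = argparse.ArgumentParser(description="Jigsaw puzzle solver training")
    parser.add_argument(
        "--mode",
        choices=["jps", "ft"],  # the two values named in this README
        required=True,          # assumption: no default mode
        help="jps: pretext training (assumed); ft: fine-tuning (assumed)",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("selected mode:", args.mode)
```

With such a parser, `python main.py --mode jps` and `python main.py --mode ft` select the two trainings, and any other value is rejected with a usage message.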