DataSet Could not be found #2

Open
Vrishod opened this issue Apr 5, 2019 · 3 comments

Comments

@Vrishod

Vrishod commented Apr 5, 2019

I could not find the dataset used for training.

@yonkshi
Owner

yonkshi commented Apr 12, 2019

Hi, the datasets we used were:

  1. Pretrained ImageNet
  2. Oxford-102 flowers
  3. Oxford-102 descriptions: https://github.com/reedscot/cvpr2016 (the authors of the original paper used Amazon Mechanical Turk to collect text descriptions of the Oxford-102 flowers)

@SreenijaK

@yonkshi In case I want to work with my own dataset, how do I create the embeddings? How do I convert the text files to .t7 format?

@yonkshi
Owner

yonkshi commented Jul 24, 2019

@SreenijaK You can check out the original GAN-CLS paper. The authors mention that you can either pretrain the text embedding or train it end to end with the GAN. To pretrain it, you train the text encoder against the embedding layer of a pretrained GoogLeNet. GoogLeNet already provides a well-defined embedding for images, and you want to train your text embedding to be as close to that as possible.
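To make the "as close as possible" idea concrete, here is a minimal sketch in plain Python. It is not the paper's objective: the paper trains a structured joint embedding, while this sketch swaps in a plain squared-distance loss and a linear text encoder. The names `train_text_encoder`, `embed`, and `mse`, and the toy feature vectors, are all hypothetical and only for illustration.

```python
import random


def embed(W, x):
    """Apply the linear text encoder W to one text feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mse(W, text_feats, image_embs):
    """Mean squared distance between text embeddings and image embeddings."""
    total, count = 0.0, 0
    for x, y in zip(text_feats, image_embs):
        pred = embed(W, x)
        total += sum((p - t) ** 2 for p, t in zip(pred, y))
        count += len(y)
    return total / count


def train_text_encoder(text_feats, image_embs, lr=0.1, epochs=200, seed=0):
    """Fit W with SGD so that W @ text_feat approximates the image embedding.

    text_feats: hypothetical fixed-length text features (e.g. bag of words).
    image_embs: embeddings from a pretrained image network, kept frozen.
    """
    rng = random.Random(seed)
    d_in, d_out = len(text_feats[0]), len(image_embs[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    for _ in range(epochs):
        for x, y in zip(text_feats, image_embs):
            err = [p - t for p, t in zip(embed(W, x), y)]
            # Gradient of 0.5 * ||W x - y||^2 w.r.t. W[i][j] is err[i] * x[j].
            for i in range(d_out):
                for j in range(d_in):
                    W[i][j] -= lr * err[i] * x[j]
    return W


# Toy data: three captions (one-hot features) paired with 2-D image embeddings.
text_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W = train_text_encoder(text_feats, image_embs)
```

In a real setup the linear map would be replaced by the actual text encoder and the toy vectors by image-network features, but the structure stays the same: the image embeddings are fixed targets and only the text side is trained.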

As for the .t7 file, it is simply a Torch dataset file (similar to an h5 file, if you are familiar with those). You can write a custom dataloader if you plan on using .t7. I am not using .t7 files myself, as I would need to install Torch for that to work.
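If you would rather skip .t7 entirely, a custom loader can read the caption text files directly. The sketch below assumes a hypothetical layout of one `.txt` file per image with one caption per line, which may not match your data. `load_captions` and `iter_batches` are illustrative names, not functions from this repo.

```python
from pathlib import Path


def load_captions(root):
    """Return {image key: [caption, ...]} from a folder tree of .txt files.

    Assumes one .txt file per image and one caption per line (hypothetical
    layout). The key is the file path relative to root, without the suffix.
    """
    root = Path(root)
    captions = {}
    for txt in sorted(root.rglob("*.txt")):
        lines = txt.read_text(encoding="utf-8").splitlines()
        key = txt.relative_to(root).with_suffix("").as_posix()
        # Drop blank lines and surrounding whitespace.
        captions[key] = [ln.strip() for ln in lines if ln.strip()]
    return captions


def iter_batches(captions, batch_size):
    """Yield lists of (image key, caption) pairs, at most batch_size each."""
    pairs = [(key, cap) for key, caps in captions.items() for cap in caps]
    for start in range(0, len(pairs), batch_size):
        yield pairs[start:start + batch_size]
```

Each batch of captions can then be passed through whatever text encoder you trained, so the embeddings are computed on the fly instead of being stored in a .t7 file.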
