imdb.load_data not returning n_words=10000 #1

JacobChrist · 2017-04-01T14:52:28Z

In this line of code:

train, test, _ = imdb.load_data(path='imdb.pkl', n_words=10000, valid_portion=0.1)

It appears to split the data set into three lists: train, test, and everything. Yet when I run the code, it appears to train on 22500 pieces of data.

Obtaining imdb db...
numpy.shape(train)= (2, 22500)
numpy.shape(test)= (2, 2500)
numpy.shape(_)= (2, 25000)

This web page suggests that n_words should perhaps be num_words, but using that gives an error.
https://keras.io/datasets/

I suspect this may be a bug in the tflearn library.
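
Another reading that would also explain the shapes above, offered as an assumption rather than confirmed library behavior: n_words may cap the vocabulary size (how many distinct word indices are kept), not the number of reviews, while valid_portion=0.1 holds out 10% of the 25000 training reviews. The sketch below only reproduces that arithmetic in plain Python. The helpers split_sizes and cap_vocabulary are hypothetical names for illustration and are not part of tflearn.

```python
# Hypothetical helpers, not tflearn code: they only illustrate the
# assumption that valid_portion splits the training set and that
# n_words limits the vocabulary rather than the sample count.

def split_sizes(n_samples, valid_portion):
    """Return (n_train, n_valid) when valid_portion of the samples is held out."""
    n_train = int(round(n_samples * (1.0 - valid_portion)))
    n_valid = n_samples - n_train
    return n_train, n_valid


def cap_vocabulary(sequences, n_words, unk=1):
    """Replace every word index >= n_words with a single 'unknown' index."""
    return [[w if w < n_words else unk for w in seq] for seq in sequences]


n_train, n_valid = split_sizes(25000, 0.1)
print(n_train, n_valid)  # 22500 2500, matching the shapes reported above

# Two toy reviews encoded as word indices; the sample count is unchanged,
# only out-of-vocabulary indices are rewritten.
capped = cap_vocabulary([[5, 12000, 42], [9999, 10000]], n_words=10000)
print(capped)  # [[5, 1, 42], [9999, 1]]
```

Under this assumption the three returned lists would be the training split (22500), the held-out validation split (2500), and the separate 25000-review test set, so the variable named test above would actually hold validation data.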

JacobChrist · 2017-04-01T15:53:14Z

I filed an issue in the tflearn repo: tflearn/tflearn#692

imWildCat · 2017-04-25T21:56:47Z

Same issue, how did you fix it?
