imdb.load_data not returning n_words=10000 #1

Open
JacobChrist opened this issue Apr 1, 2017 · 2 comments

Comments

@JacobChrist

In this line of code:

train, test, _ = imdb.load_data(path='imdb.pkl', n_words=10000, valid_portion=0.1)

It appears to split the dataset into three lists: train, test, and everything. Yet when I run the code, it appears to train on 22500 samples.

Obtaining imdb db...
numpy.shape(train)= (2, 22500)
numpy.shape(test)= (2, 2500)
numpy.shape(_)= (2, 25000)
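
The shapes above are consistent with a train/validation/test split of the reviews rather than with a limit on the number of samples. A minimal sketch of the arithmetic, assuming `load_data` holds out `valid_portion` of the 25000 training reviews for validation and returns `(train, validation, test)`, each as a pair of sequences and labels (hence the leading 2). `split_sizes` is a hypothetical helper for illustration, not part of tflearn:

```python
def split_sizes(n_train_total, n_test_total, valid_portion):
    """Return (train, validation, test) sample counts.

    Assumption: the validation set is carved out of the training
    reviews, and the test set is left untouched.
    """
    n_train = int(round(n_train_total * (1.0 - valid_portion)))
    n_valid = n_train_total - n_train
    return n_train, n_valid, n_test_total


sizes = split_sizes(25000, 25000, valid_portion=0.1)
print(sizes)  # (22500, 2500, 25000)
```

Under that assumption, the second returned value would be the validation set and the third the test set, which would match the three shapes printed above.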

This page suggests that n_words should perhaps be num_words, but using num_words raises an error.
https://keras.io/datasets/

I suspect this may be a bug in the tflearn library.
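
One way to check whether `n_words` is being applied is to look at the word indices instead of the number of reviews. The sketch below assumes `n_words` caps the vocabulary by replacing every word index at or above the limit with an "unknown" index of 1, and leaves the number of reviews unchanged. Both the replacement rule and the `cap_vocabulary` helper are assumptions for illustration, not tflearn's confirmed behavior:

```python
def cap_vocabulary(sequences, n_words, unknown_index=1):
    """Replace word indices >= n_words with unknown_index.

    Assumption: this mirrors how a vocabulary limit is typically
    applied; it never removes a review, it only rewrites rare words.
    """
    return [
        [w if w < n_words else unknown_index for w in seq]
        for seq in sequences
    ]


# Two toy "reviews" encoded as word indices.
toy = [[5, 120, 15000], [9999, 10000, 3]]
capped = cap_vocabulary(toy, n_words=10000)

print(capped)                            # [[5, 120, 1], [9999, 1, 3]]
print(len(capped) == len(toy))           # True: same number of reviews
print(max(max(s) for s in capped))       # 9999
```

If the real loader works this way, the largest index found in the training sequences should be below 10000 even though the training set still holds 22500 reviews.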

@JacobChrist
Author

I filed an issue in the tflearn repo: tflearn/tflearn#692

@imWildCat

Same issue, how did you fix it?
