How to Handle Corrupted Images #123

Open
samadejacobs opened this issue Oct 24, 2017 · 2 comments


@samadejacobs
Collaborator

ImageNet21k (a 14-million-image dataset) contains a number of "corrupted" images. LBANN needs to provide a way to handle (possibly skip) them. Thoughts?

@ndryden
Collaborator

ndryden commented Oct 24, 2017

How are they corrupted? Is it in the sense that the images do not load, or that they load fine but are otherwise corrupted (image artifacts, etc.)?

Incidentally, I thought ImageNet was supposed to be pretty clean, since everything is human-annotated. Is this a broadly recognized issue, or is it maybe just due to our download?

@samadejacobs
Collaborator Author

The images do not load, i.e.:
LBANN: caught error message: /usr/workspace/wsa/jacobs32/lbann.git/src/data_readers/data_reader_imagenet.cpp 62ImageNet: image_utils::loadJPG failed to load - /p/lscratche/brainusr/datasets/ImageNetALL_extracted/n04257684/n04257684_9033.JPEG

Manually opening the image file (gnome-open) also failed. This happened several hours into program execution, which suggests that there are probably more good images than bad ones.
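
For what it's worth, here is a rough sketch of one way the "skip" option could look. It is not LBANN code, and the names `has_jpeg_markers`, `skip_unloadable`, and `Loader` are made up for illustration. The first function is only a cheap structural check (the file starts with the JPEG SOI marker `FF D8` and ends with the EOI marker `FF D9`), so it would catch truncated downloads but not every decode failure, and it would wrongly reject files with trailing bytes after EOI. The second function assumes a loader callback that returns `false` or throws on failure, and it logs and skips those files instead of aborting the run.

```cpp
#include <exception>
#include <fstream>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Cheap structural check (an assumption, not a full decode): a JPEG stream
// begins with SOI (FF D8) and normally ends with EOI (FF D9).
bool has_jpeg_markers(const std::string& path) {
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) return false;                       // missing or unreadable file
  const std::streamoff size = in.tellg();
  if (size < 4) return false;                  // too short to hold both markers
  unsigned char head[2] = {0, 0};
  unsigned char tail[2] = {0, 0};
  in.seekg(0, std::ios::beg);
  in.read(reinterpret_cast<char*>(head), 2);
  in.seekg(-2, std::ios::end);
  in.read(reinterpret_cast<char*>(tail), 2);
  if (!in) return false;                       // a seek or read failed
  return head[0] == 0xFF && head[1] == 0xD8 &&
         tail[0] == 0xFF && tail[1] == 0xD9;
}

// Hypothetical loader signature: returns true if the image could be loaded.
// In a real data reader this would wrap the actual decode call.
using Loader = std::function<bool(const std::string&)>;

// Returns the paths that loaded; failures are logged and optionally collected
// in `skipped` so the bad files can be reported or removed from the list later.
std::vector<std::string> skip_unloadable(const std::vector<std::string>& paths,
                                         const Loader& try_load,
                                         std::vector<std::string>* skipped = nullptr) {
  std::vector<std::string> good;
  for (const auto& p : paths) {
    bool ok = false;
    try {
      ok = try_load(p);
    } catch (const std::exception&) {
      ok = false;                              // treat a throwing loader as a failed load
    }
    if (ok) {
      good.push_back(p);
    } else {
      std::cerr << "skipping unreadable image: " << p << '\n';
      if (skipped) skipped->push_back(p);
    }
  }
  return good;
}
```

Running something like `skip_unloadable(paths, has_jpeg_markers, &skipped)` once over the file list before training would move the failure from several hours into the run to the start of it, and the `skipped` list would show how many bad images there actually are.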

oyamay pushed a commit to oyamay/lbann that referenced this issue Jun 16, 2020
* Clean up split

* Clean up sum

* clean up

* Bug fix