Image Dataset

This is a list of image datasets that are available free of charge.

For each entry, the first line is a description, the second is the URL, and the last is the license or terms of use (if any).

ImageNet

ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images.
http://www.image-net.org/

Manga109

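Manga109 consists of 109 manga volumes drawn by professional Japanese manga artists. The volumes were published between the 1970s and the 2010s and cover a wide range of target readerships and genres.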
http://www.manga109.org/index.php

Tiny Images Dataset

This page has links for downloading the Tiny Images dataset, which consists of 79,302,017 images, each being a 32x32 color image. This data is stored in the form of large binary files which can be accessed by a Matlab toolbox that we have written. You will need around 400 GB of free disk space to store all the files. In total there are 5 files that need to be downloaded, 3 of which are large binary files consisting of (i) the images themselves; (ii) their associated metadata (filename, search engine used, ranking, etc.); (iii) Gist descriptors for each image. The other two files are the Matlab toolbox and an index data file that together let you easily load data from the binaries.
http://horatio.cs.nyu.edu/mit/tiny/data/
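
As a rough sanity check on the figures above, the sketch below estimates the size of the raw pixel data. It assumes one unsigned byte per colour channel and fixed-size records stored back to back with no header; neither assumption is confirmed by the description, and `image_byte_range` is a hypothetical helper, not part of the Matlab toolbox.

```python
from typing import Tuple

NUM_IMAGES = 79_302_017
WIDTH = HEIGHT = 32
CHANNELS = 3  # colour image

# Assumption: each channel value is stored as a single byte (uint8).
BYTES_PER_IMAGE = WIDTH * HEIGHT * CHANNELS  # 3072


def image_byte_range(index: int) -> Tuple[int, int]:
    """Return (start, end) byte offsets of image `index`.

    Assumption: records are contiguous, fixed-size, and the file has no header.
    """
    if not 0 <= index < NUM_IMAGES:
        raise IndexError(index)
    start = index * BYTES_PER_IMAGE
    return start, start + BYTES_PER_IMAGE


total_bytes = NUM_IMAGES * BYTES_PER_IMAGE
print(f"{total_bytes / 1e9:.1f} GB of raw pixels")  # 243.6 GB of raw pixels
```

Under these assumptions the images alone account for roughly 244 GB of the quoted 400 GB; the rest would be the metadata, Gist descriptors, and index files listed in the description.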

Labeled Faces in the Wild

The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1680 of the people pictured have two or more distinct photos in the data set.
http://vis-www.cs.umass.edu/lfw/

notMNIST

I've taken some publicly available fonts and extracted glyphs from them to make a dataset similar to MNIST. There are 10 classes, with letters A-J taken from different fonts.
http://yaroslavvb.blogspot.jp/2011/09/notmnist-dataset.html

The CIFAR-10 dataset

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
http://www.cs.toronto.edu/~kriz/cifar.html
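
The counts in the description can be cross-checked with a few lines of arithmetic. This is only a consistency sketch; the three-channel, one-byte-per-channel layout is an assumption made for illustration.

```python
NUM_CLASSES = 10
IMAGES_PER_CLASS = 6000
TRAIN_IMAGES = 50000
TEST_IMAGES = 10000

# Class counts and the train/test split should describe the same total.
total = NUM_CLASSES * IMAGES_PER_CLASS
assert total == TRAIN_IMAGES + TEST_IMAGES

# Assumption: 3 colour channels, one value per channel per pixel.
values_per_image = 32 * 32 * 3

print(total, values_per_image)  # 60000 3072
```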

The Oxford-IIIT Pet Dataset

We have created a 37-category pet dataset with roughly 200 images for each class. The images have large variations in scale, pose and lighting. All images have an associated ground truth annotation of breed, head ROI, and pixel-level trimap segmentation.
http://www.robots.ox.ac.uk/~vgg/data/pets/

YouTube-8M Dataset

YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs and associated labels from a diverse vocabulary of 4700+ visual entities.
https://research.google.com/youtube8m/

YouTube-BoundingBoxes

YouTube-BoundingBoxes is a large-scale data set of video URLs with densely-sampled high-quality single-object bounding box annotations.
https://research.google.com/youtube-bb/

Open Images dataset

Open Images is a dataset of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories.
https://github.com/openimages/dataset

CelebA Dataset

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including
http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

IMDB-WIKI – 500k+ face images with age and gender labels

To the best of our knowledge this is the largest publicly available dataset of face images with gender and age labels for training. We provide pretrained models for both age and gender prediction.
https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/

STAIR Captions

Our dataset consists of 820,310 Japanese captions for 164,062 images.
http://captions.stair.center/

Agricultural Human Detection and Tracking

The NREC Person Detection Dataset is a collection of off-road videos taken in an apple orchard and orange grove. The videos are collected with a set of visible people in a variety of outfits, locations, and times. We encourage you to train a detector on our dataset and submit your curves for display on this webpage.
http://www.nrec.ri.cmu.edu/projects/usdapersondetection/dataset/

fashion-mnist

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
https://github.com/zalandoresearch/fashion-mnist
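
A minimal sketch of reading such images, assuming the files use the same IDX binary layout as the original MNIST (a 16-byte big-endian header holding a magic number, the image count, rows, and columns, followed by one byte per pixel). `parse_idx_images` is a hypothetical helper, and the buffer below is synthetic stand-in data rather than the real download.

```python
import struct

IDX_IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data with 3 dimensions
HEADER_SIZE = 16        # four big-endian 32-bit integers


def parse_idx_images(buf: bytes):
    """Split an IDX image buffer into flat per-image byte strings."""
    magic, count, rows, cols = struct.unpack(">IIII", buf[:HEADER_SIZE])
    if magic != IDX_IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    size = rows * cols
    if len(buf) != HEADER_SIZE + count * size:
        raise ValueError("buffer length does not match header")
    images = [
        buf[HEADER_SIZE + i * size : HEADER_SIZE + (i + 1) * size]
        for i in range(count)
    ]
    return images, rows, cols


# Synthetic buffer: two all-black 28x28 grayscale images.
fake = struct.pack(">IIII", IDX_IMAGE_MAGIC, 2, 28, 28) + bytes(2 * 28 * 28)
images, rows, cols = parse_idx_images(fake)
print(len(images), rows, cols, len(images[0]))  # 2 28 28 784
```

With the real training file decompressed into `buf`, the same call would be expected to yield 60,000 entries of 784 bytes each.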

DANBOORU2017: A LARGE-SCALE CROWDSOURCED AND TAGGED ANIME ILLUSTRATION DATASET

We create & provide a torrent which contains ~1.9 TB of 2.94M images with 77.5M tag instances (of 333k defined tags, ~26.3/image) covering Danbooru from 24 May 2005 through 31 December 2017 (final ID: #2,973,532), providing the image files & a JSON export of the metadata.
https://www.gwern.net/Danbooru2017
