This is a list of image datasets available free of charge.
For each entry, the first row is the description, the second is the URL, and the last is the license or terms of use (if any).
- ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images.
- http://www.image-net.org/
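- Manga109 consists of 109 manga volumes drawn by professional Japanese manga artists. The volumes were published between the 1970s and the 2010s and cover a wide range of target readerships and genres.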
- http://www.manga109.org/index.php
- This page has links for downloading the Tiny Images dataset, which consists of 79,302,017 images, each being a 32x32 color image. This data is stored in the form of large binary files which can be accessed by a Matlab toolbox that we have written. You will need around 400 GB of free disk space to store all the files. In total there are 5 files that need to be downloaded, 3 of which are large binary files consisting of (i) the images themselves; (ii) their associated metadata (filename, search engine used, ranking, etc.); (iii) Gist descriptors for each image. The other two files are the Matlab toolbox and index data file that together let you easily load in data from the binaries.
- http://horatio.cs.nyu.edu/mit/tiny/data/
- Labeled Faces in the Wild (LFW): The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1680 of the people pictured have two or more distinct photos in the data set.
- http://vis-www.cs.umass.edu/lfw/
- notMNIST: I've taken some publicly available fonts and extracted glyphs from them to make a dataset similar to MNIST. There are 10 classes, with letters A-J taken from different fonts.
- http://yaroslavvb.blogspot.jp/2011/09/notmnist-dataset.html
- The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
- http://www.cs.toronto.edu/~kriz/cifar.html
- Oxford-IIIT Pet Dataset: We have created a 37-category pet dataset with roughly 200 images for each class. The images have large variations in scale, pose and lighting. All images have an associated ground truth annotation of breed, head ROI, and pixel-level trimap segmentation.
- http://www.robots.ox.ac.uk/~vgg/data/pets/
- YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs and associated labels from a diverse vocabulary of 4700+ visual entities.
- https://research.google.com/youtube8m/
- YouTube-BoundingBoxes is a large-scale data set of video URLs with densely-sampled high-quality single-object bounding box annotations.
- https://research.google.com/youtube-bb/
- Open Images is a dataset of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories.
- https://github.com/openimages/dataset
- CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations.
- http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
- IMDB-WIKI: To the best of our knowledge this is the largest publicly available dataset of face images with gender and age labels for training. We provide pretrained models for both age and gender prediction.
- https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/
- STAIR Captions: Our dataset consists of 820,310 Japanese captions for 164,062 images.
- http://captions.stair.center/
- The NREC Person Detection Dataset is a collection of off-road videos taken in an apple orchard and orange grove. The videos are collected with a set of visible people in a variety of outfits, locations, and times. We encourage you to train a detector on our dataset and submit your curves for display on this webpage.
- http://www.nrec.ri.cmu.edu/projects/usdapersondetection/dataset/
- Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
- https://github.com/zalandoresearch/fashion-mnist
- We create & provide a torrent which contains ~1.9 TB of 2.94M images with 77.5M tag instances (of 333k defined tags, ~26.3 per image) covering Danbooru from 24 May 2005 through 31 December 2017 (final ID: #2,973,532), providing the image files & a JSON export of the metadata.
- https://www.gwern.net/Danbooru2017
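
Several of the descriptions above quote image counts and dimensions, so a quick back-of-the-envelope check can help when planning disk space. The sketch below is only illustrative: the helper `raw_image_bytes` is a hypothetical name, and the uncompressed one-byte-per-channel layout is an assumption, not something every dataset page states. All numeric figures are copied from the descriptions in this list.

```python
# Rough size and ratio checks using only figures quoted in the list above.
# Assumption: images are stored uncompressed, one byte per channel per pixel.


def raw_image_bytes(count, width, height, channels):
    """Return the uncompressed size in bytes of `count` images."""
    return count * width * height * channels


# Tiny Images: 79,302,017 colour (3-channel) images at 32x32.
tiny_bytes = raw_image_bytes(79_302_017, 32, 32, 3)
tiny_gb = tiny_bytes / 1e9  # decimal gigabytes

# CIFAR-10: 50000 training + 10000 test images, 10 classes, 32x32 colour.
cifar_total = 50_000 + 10_000
cifar_per_class = cifar_total // 10
cifar_bytes = raw_image_bytes(cifar_total, 32, 32, 3)

# Fashion-MNIST: 60,000 training + 10,000 test images, 28x28 grayscale.
fashion_bytes = raw_image_bytes(60_000 + 10_000, 28, 28, 1)

# STAIR Captions: 820,310 captions for 164,062 images.
captions_per_image = 820_310 / 164_062

# Danbooru2017: 77.5M tag instances over 2.94M images (both rounded figures).
tags_per_image = 77.5e6 / 2.94e6

if __name__ == "__main__":
    print(f"Tiny Images pixel data: {tiny_gb:.1f} GB")
    print(f"CIFAR-10: {cifar_total} images, {cifar_per_class} per class, "
          f"{cifar_bytes / 1e6:.1f} MB raw")
    print(f"Fashion-MNIST: {fashion_bytes / 1e6:.1f} MB raw")
    print(f"STAIR Captions: {captions_per_image:.2f} captions per image")
    print(f"Danbooru2017: {tags_per_image:.1f} tags per image")
```

Under these assumptions the Tiny Images pixel data alone comes to roughly 244 GB, which fits within the quoted 400 GB once the metadata and Gist descriptor files are added. The CIFAR-10 split sums to the stated 60000 images at 6000 per class, and the STAIR figures work out to five captions per image.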