Image Dataset

This is a list of image datasets that are available free of charge.

For each dataset, the first bullet is a description, the second is the URL, and the last is the license or terms of use (if any).

ImageNet

  • ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images.
  • http://www.image-net.org/

Manga109

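  • Manga109 consists of 109 manga volumes drawn by professional Japanese manga artists. The volumes were published between the 1970s and the 2010s and cover a wide range of target readerships and genres.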
  • http://www.manga109.org/index.php

Tiny Images Dataset

  • This page has links for downloading the Tiny Images dataset, which consists of 79,302,017 images, each being a 32x32 color image. This data is stored in the form of large binary files which can be accessed by a Matlab toolbox that we have written. You will need around 400 GB of free disk space to store all the files. In total there are 5 files that need to be downloaded, 3 of which are large binary files consisting of (i) the images themselves; (ii) their associated metadata (filename, search engine used, ranking, etc.); (iii) Gist descriptors for each image. The other two files are the Matlab toolbox and the index data file, which together let you easily load data from the binaries.
  • http://horatio.cs.nyu.edu/mit/tiny/data/
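
The description above implies a fixed record size per image, which makes random access into the image binary straightforward. The following is a minimal sketch, assuming the file is a flat, headerless sequence of images with one byte per channel value (3,072 bytes each). That layout, and the helper names `image_offset` and `read_image_bytes`, are assumptions made for illustration and are not taken from the dataset's own Matlab toolbox.

```python
WIDTH, HEIGHT, CHANNELS = 32, 32, 3          # from the description: 32x32 color images
BYTES_PER_IMAGE = WIDTH * HEIGHT * CHANNELS  # 3072, assuming one byte per channel value
NUM_IMAGES = 79_302_017                      # image count quoted in the description


def image_offset(index: int) -> int:
    """Byte offset of image `index` in a flat, headerless file (assumed layout)."""
    if not 0 <= index < NUM_IMAGES:
        raise IndexError(f"image index out of range: {index}")
    return index * BYTES_PER_IMAGE


def read_image_bytes(path: str, index: int) -> bytes:
    """Read the raw bytes of a single image without loading the whole file."""
    with open(path, "rb") as f:
        f.seek(image_offset(index))
        data = f.read(BYTES_PER_IMAGE)
    if len(data) != BYTES_PER_IMAGE:
        raise EOFError(f"file ends before image {index}")
    return data


# Size of the raw pixel data alone under the assumed layout.
TOTAL_IMAGE_BYTES = NUM_IMAGES * BYTES_PER_IMAGE
```

Under these assumptions the pixel data alone comes to roughly 244 GB, which fits within the "around 400 GB" quoted for all five files. The channel and pixel ordering inside each 3,072-byte record is not stated in the description, so it should be checked against the toolbox before the bytes are reshaped into an image.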

Labeled Faces in the Wild

  • The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1680 of the people pictured have two or more distinct photos in the data set.
  • http://vis-www.cs.umass.edu/lfw/

notMNIST

The CIFAR-10 dataset

The Oxford-IIIT Pet Dataset

  • We have created a 37-category pet dataset with roughly 200 images for each class. The images have large variations in scale, pose and lighting. All images have an associated ground truth annotation of breed, head ROI, and pixel-level trimap segmentation.
  • http://www.robots.ox.ac.uk/~vgg/data/pets/

YouTube-8M Dataset

  • YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs and associated labels from a diverse vocabulary of 4700+ visual entities.
  • https://research.google.com/youtube8m/

YouTube-BoundingBoxes

Open Images dataset

CelebA Dataset

  • CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations.
  • http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

IMDB-WIKI – 500k+ face images with age and gender labels

STAIR Captions

Agricultural Human Detection and Tracking

  • The NREC Person Detection Dataset is a collection of off-road videos taken in an apple orchard and orange grove. The videos are collected with a set of visible people in a variety of outfits, locations, and times. We encourage you to train a detector on our dataset and submit your curves for display on this webpage.
  • http://www.nrec.ri.cmu.edu/projects/usdapersondetection/dataset/

fashion-mnist

  • Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
  • https://github.com/zalandoresearch/fashion-mnist
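
Fashion-MNIST is distributed in the same gzip-compressed IDX format as the original MNIST files. The sketch below shows how the 28x28 images and their labels could be read using only the standard library. The function names are hypothetical, and the file name in the usage note (`train-images-idx3-ubyte.gz`) is assumed to match the one used in the repository.

```python
import gzip
import struct

IMAGE_MAGIC = 2051  # IDX magic number for unsigned-byte, 3-dimensional data
LABEL_MAGIC = 2049  # IDX magic number for unsigned-byte, 1-dimensional data


def parse_idx_images(raw: bytes):
    """Split an uncompressed IDX image buffer into one bytes object per image."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])  # big-endian header
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    size = rows * cols
    pixels = raw[16:]
    if len(pixels) != count * size:
        raise ValueError("pixel payload does not match the header")
    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return images, rows, cols


def parse_idx_labels(raw: bytes):
    """Return the labels of an uncompressed IDX label buffer as a list of ints."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    labels = list(raw[8:])
    if len(labels) != count:
        raise ValueError("label payload does not match the header")
    return labels


def load_gz(path: str) -> bytes:
    """Read and decompress a .gz file into memory."""
    with gzip.open(path, "rb") as f:
        return f.read()
```

A typical call would be `images, rows, cols = parse_idx_images(load_gz("train-images-idx3-ubyte.gz"))`, after which each element of `images` holds `rows * cols` grayscale bytes in row-major order.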

Danbooru2017: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset

  • We create & provide a torrent which contains ~1.9 TB of 2.94M images with 77.5M tag instances (of 333K defined tags, ~26.3 per image) covering Danbooru from 24 May 2005 through 31 December 2017 (final ID: #2,973,532), providing the image files & a JSON export of the metadata.
  • https://www.gwern.net/Danbooru2017