Data Description

PaddleMM provides processing for multi-modal data, including text and images. The folder storing a dataset is organized as follows (a minimal loading sketch follows the list):

  • images (stores the original images of the dataset)
  • img_feat.npy (image region features extracted by Faster-RCNN)
  • img_box.npy (bounding-box locations of the image regions extracted by Faster-RCNN)
  • dataset.json (stores the relevant information of the original dataset, such as text, dataset splits, labels, etc.; see paddlemm/datasets/reader for how it is read)
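
The sketch below shows one way to inspect these files once they are in place. The data_root path is hypothetical, and the internal structure of dataset.json should be taken from paddlemm/datasets/reader rather than assumed from this example.

```python
# Minimal sketch for inspecting the prepared dataset files.
# NOTE: "data/coco" is a hypothetical path; the authoritative parsing of
# dataset.json lives in paddlemm/datasets/reader.
import json

import numpy as np

data_root = "data/coco"

# Faster-RCNN region features and their bounding-box locations
# (add allow_pickle=True if the arrays were saved as object arrays).
img_feat = np.load(f"{data_root}/img_feat.npy")
img_box = np.load(f"{data_root}/img_box.npy")

with open(f"{data_root}/dataset.json", "r") as f:
    dataset = json.load(f)

print("feature array shape:", img_feat.shape)
print("box array shape:", img_box.shape)
print("dataset.json top-level type:", type(dataset))
```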

MS-COCO Dataset

To obtain the toolkit's standard data-loading format, process the MS-COCO dataset as follows:

  • Step 1. Download the COCO2014 Train/Val images and captions here, then merge the training and validation images into the 'images' folder (see the sketch after this list).
  • Step 2. Download the COCO preprocessing and split files provided by Andrej Karpathy here.
  • Step 3. Download the COCO region features and location information extracted by Faster-RCNN here.
  • Step 4. Use paddlemm/scripts/coco_region.py and paddlemm/scripts/coco_label.py to process the original data and obtain the image features and labels.
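
For Step 1, the exact directory layout after unpacking is not fixed by the toolkit; the sketch below assumes the standard train2014/ and val2014/ folder names from the COCO download and simply copies both sets into a single images folder.

```python
# Sketch for Step 1: merge COCO2014 train/val images into one "images" folder.
# The source folder names (train2014, val2014) and the destination path are
# assumptions based on the standard COCO download layout.
import shutil
from pathlib import Path

dst = Path("data/coco/images")
dst.mkdir(parents=True, exist_ok=True)

for src_dir in (Path("train2014"), Path("val2014")):
    for img_path in src_dir.glob("*.jpg"):
        shutil.copy2(img_path, dst / img_path.name)
```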

Twitter Dataset

If you want to try the visualization module for the fusion task, please download the dataset and modify the configuration as follows:

  • Step 1. Download each tweet's associated image here.
  • Step 2. Download the Twitter-17 dataset here, and the Twitter-15 dataset here.
  • Step 3. Modify the configuration parameters, for example dataset: "twitter", data_mode: "twitter", visual: "tsne", choose: "fusion" (summarized in the sketch below).
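
The overrides from Step 3 are summarized below as a plain Python dictionary. This is only a checklist of the values to set; the actual option file format and loader are defined by the toolkit's own configuration handling.

```python
# Hedged summary of the Step 3 option overrides; the real configuration is
# read by the toolkit itself, so treat this dict only as a checklist of values.
twitter_fusion_options = {
    "dataset": "twitter",    # load the Twitter dataset
    "data_mode": "twitter",  # use the Twitter-format data reader
    "visual": "tsne",        # visualize fused features with t-SNE
    "choose": "fusion",      # run visualization for the fusion task
}
```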