DeepPose

NOTE: This is not an official implementation. The original paper is DeepPose: Human Pose Estimation via Deep Neural Networks.

Requirements

Chainer 1.5+ (Neural network framework)
numpy 1.9+
scipy 0.16+
scikit-learn 0.15+
OpenCV 2.4+

Data preparation

bash shells/download.sh
python scripts/flic_dataset.py
python scripts/lsp_dataset.py

These scripts download the FLIC-full dataset (http://vision.grasp.upenn.edu/cgi-bin/index.php?n=VideoLearning.FLIC), crop the human regions, and save the poses as numpy files in the FLIC-full directory.
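The cropping step can be illustrated with a minimal sketch. It is not the code in scripts/flic_dataset.py: the function name `crop_around_joints`, the `pad` factor, and the output file name are assumptions made for this example only.

```python
import os
import tempfile

import numpy as np


def crop_around_joints(image, joints, pad=1.5):
    """Crop a box around the joints, enlarged by `pad` (hypothetical helper).

    image:  (H, W, C) array
    joints: (n_joints, 2) array of (x, y) pixel coordinates
    Returns the cropped image and the joints in crop coordinates.
    """
    h, w = image.shape[:2]
    x_min, y_min = joints.min(axis=0)
    x_max, y_max = joints.max(axis=0)
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    half_w = (x_max - x_min) * pad / 2.0
    half_h = (y_max - y_min) * pad / 2.0
    # Clip the padded box to the image bounds.
    x0 = int(max(0, np.floor(cx - half_w)))
    y0 = int(max(0, np.floor(cy - half_h)))
    x1 = int(min(w, np.ceil(cx + half_w)))
    y1 = int(min(h, np.ceil(cy + half_h)))
    cropped = image[y0:y1, x0:x1]
    # Shift joints so they are relative to the top-left corner of the crop.
    shifted = joints - np.array([x0, y0], dtype=joints.dtype)
    return cropped, shifted


image = np.zeros((100, 200, 3), dtype=np.uint8)
joints = np.array([[50.0, 20.0], [90.0, 60.0], [70.0, 40.0]])
cropped, shifted = crop_around_joints(image, joints)

# Save the pose as a numpy file (temporary directory used for the example).
out_path = os.path.join(tempfile.mkdtemp(), "example_joints.npy")
np.save(out_path, shifted)
```

Keeping the joints in crop coordinates means the saved pose matches the cropped image without any further bookkeeping.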

MPII Dataset

MPII Human Pose Dataset
Number of training images: 18079; number of test images: 6908
- test images don't have any annotations
- so we split the training images into training/test joint sets
- each joint set has
Number of training joint sets: 17928; number of test joint sets: 1991
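A split of this kind could look like the following sketch. The 0.1 test ratio, the fixed seed, and the function name `split_joint_sets` are assumptions; the ratio is chosen only because it reproduces the counts listed above, and the repository's actual split procedure may differ.

```python
import numpy as np


def split_joint_sets(n_total, test_ratio=0.1, seed=0):
    """Randomly split indices 0..n_total-1 into training and test sets.

    Hypothetical helper: test_ratio and seed are illustrative defaults.
    """
    rng = np.random.RandomState(seed)
    perm = rng.permutation(n_total)
    n_test = int(n_total * test_ratio)
    # The first n_test shuffled indices become the test set.
    return perm[n_test:], perm[:n_test]


train_idx, test_idx = split_joint_sets(17928 + 1991)
print(len(train_idx), len(test_idx))
```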

Start training

For FLIC Dataset

Just run:

nohup python scripts/train.py > AlexNet_flic.log 2>&1 < /dev/null &

To speed up training, disable Chainer's type checking:

CHAINER_TYPE_CHECK=0 nohup python scripts/train.py > AlexNet_flic.log 2>&1 < /dev/null &

This is the same as:

nohup python scripts/train.py \
--model models/AlexNet_flic.py \
--gpu 0 \
--epoch 1000 \
--batchsize 32 \
--prefix AlexNet_LCN_AdaGrad_lr-0.0005 \
--snapshot 10 \
--datadir data/FLIC-full \
--channel 3 \
--flip 1 \
--size 220 \
--crop_pad_inf 1.5 \
--crop_pad_sup 2.0 \
--shift 5 \
--lcn 1 \
--joint_num 7 \
> AlexNet_LCN_AdaGrad_lr-0.0005.log 2>&1 &

--flip 1 enables left-right flip augmentation; --flip 0 disables it. --lcn 1 applies local contrast normalization (arguably this should be called "global" contrast normalization).
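The two options can be illustrated roughly as follows. This is a sketch and not the code in scripts/train.py: the function names are hypothetical, the normalization shown is the global variant (one mean and standard deviation per image), and a complete flip would usually also swap the left and right joint labels, which is omitted here.

```python
import numpy as np


def flip_lr(image, joints):
    """Mirror the image horizontally and mirror joint x-coordinates to match."""
    width = image.shape[1]
    flipped = image[:, ::-1].copy()
    mirrored = joints.copy()
    mirrored[:, 0] = (width - 1) - mirrored[:, 0]
    return flipped, mirrored


def global_contrast_normalize(image, eps=1e-5):
    """Subtract the per-image mean and divide by the per-image std."""
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + eps)


# One bright pixel at row 1, column 2, with a joint placed on it.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[1, 2] = 255
joints = np.array([[2.0, 1.0]])  # (x, y)

flipped, mirrored = flip_lr(image, joints)
normalized = global_contrast_normalize(flipped)
```

After the flip, the joint still points at the bright pixel, which is the property any flip augmentation for pose data has to preserve.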

Run the script with the --help option for details.

GPU memory requirement

batchsize: 128 -> about 2870 MiB
batchsize: 64 -> about 1890 MiB
batchsize: 32 (default) -> 1374 MiB
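The three measurements grow roughly linearly with batch size, so a rough estimate for another batch size can be obtained with a straight-line fit. This is only a sketch: the linear model is an assumption, and actual usage depends on the model, the framework version, and the GPU memory allocator.

```python
import numpy as np

# Measurements listed above: batch size -> approximate MiB.
batch_sizes = np.array([32.0, 64.0, 128.0])
memory_mib = np.array([1374.0, 1890.0, 2870.0])

# Least-squares line: memory ~= slope * batchsize + intercept.
slope, intercept = np.polyfit(batch_sizes, memory_mib, 1)


def estimate_memory_mib(batchsize):
    """Rough linear estimate; treat the result as an order-of-magnitude guide."""
    return slope * batchsize + intercept


print("MiB per sample: %.1f" % slope)
print("estimate for batchsize 96: %.0f MiB" % estimate_memory_mib(96))
```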

Visualize Filters of 1st conv layer

Go to the result directory of a model, then run:
python ../../scripts/draw_filters.py

Visualize Prediction

Example

Predict, visualize the results, and calculate mean errors

python scripts/evaluate_flic.py \
--model results/AlexNet_2015/AlexNet.py \
--param results/AlexNet_2015/AlexNet_epoch_400.chainermodel \
--datadir data/FLIC-full \
--gpu 0 \
--batchsize 128 \
--mode test
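How the mean errors might be computed can be sketched as below. The exact metric used by scripts/evaluate_flic.py is not shown here, so this assumes the mean Euclidean distance in pixels between predicted and labeled joints; the function name and array layout are hypothetical.

```python
import numpy as np


def mean_joint_errors(pred, label):
    """Mean Euclidean error per joint and overall.

    pred, label: arrays of shape (n_samples, n_joints, 2) holding (x, y).
    """
    # Distance between prediction and label for every sample and joint.
    dists = np.linalg.norm(pred - label, axis=2)
    return dists.mean(axis=0), dists.mean()


# Two samples with 7 joints each; every prediction is off by (3, 4) pixels.
label = np.zeros((2, 7, 2))
pred = label + np.array([3.0, 4.0])

per_joint, overall = mean_joint_errors(pred, label)
```

Reporting the per-joint means alongside the overall mean makes it easier to see which joints are hardest to localize.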

Tile some randomly selected result images

python scripts/evaluate_flic.py \
--model results/AlexNet_2015/AlexNet_flic.py \
--param results/AlexNet_2015/AlexNet_epoch_450.chainermodel \
--mode tile \
--n_imgs 25
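Tiling can be pictured with the following sketch, which picks 25 images at random and arranges them in a 5 x 5 grid. It is not the code behind --mode tile: the function name `tile_images`, the grid width, and the random placeholder images are assumptions for illustration.

```python
import numpy as np


def tile_images(images, n_cols):
    """Arrange (n, H, W, C) images row by row on a single canvas."""
    n, h, w, c = images.shape
    n_rows = int(np.ceil(n / float(n_cols)))
    canvas = np.zeros((n_rows * h, n_cols * w, c), dtype=images.dtype)
    for i, img in enumerate(images):
        row, col = divmod(i, n_cols)
        canvas[row * h:(row + 1) * h, col * w:(col + 1) * w] = img
    return canvas


rng = np.random.RandomState(0)
# Placeholder for result images: 100 random 8x8 RGB images.
pool = rng.randint(0, 256, size=(100, 8, 8, 3)).astype(np.uint8)
# Pick 25 distinct images at random, as with --n_imgs 25.
chosen = rng.choice(len(pool), size=25, replace=False)
tiled = tile_images(pool[chosen], n_cols=5)
```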

Create an animated GIF to compare predictions and labels

cd results/AlexNet_2015
bash ../../scripts/create_anime.sh test_450_tiled_pred.jpg test_450_tiled_label.jpg test_450.gif
