This is a simple, raw implementation of human pose estimation using DenseNet.
This URL is a DenseNet model structure definition based on TensorFlow.
This is the basic structure of DenseNet:
I use 3 dense blocks with a growth rate of 12, and add two fully connected layers on top for the keypoint regression task.
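For orientation, here is a minimal sketch of such a structure in TensorFlow 1.x; the number of layers per block, the filter widths, and the FC layer sizes are my assumptions for illustration, not necessarily the exact values in this repo:

```python
import tensorflow as tf  # assuming TensorFlow 1.x, matching the repo's era

def dense_block(x, num_layers, growth_rate, training, name):
    """One dense block: each layer's output is concatenated onto its input."""
    with tf.variable_scope(name):
        for i in range(num_layers):
            y = tf.layers.batch_normalization(x, training=training)
            y = tf.nn.relu(y)
            y = tf.layers.conv2d(y, growth_rate, 3, padding='same')
            x = tf.concat([x, y], axis=-1)  # dense connectivity
    return x

def transition(x, name):
    """1x1 conv to halve the channels, then 2x2 average pooling."""
    with tf.variable_scope(name):
        x = tf.layers.conv2d(x, x.get_shape().as_list()[-1] // 2, 1)
        x = tf.layers.average_pooling2d(x, 2, 2)
    return x

def densenet_regressor(images, n_joints=11, training=True):
    """3 dense blocks (growth rate 12) + two FC layers for keypoint regression."""
    x = tf.layers.conv2d(images, 24, 3, padding='same')  # stem convolution
    for b in range(3):
        x = dense_block(x, num_layers=12, growth_rate=12,
                        training=training, name='block%d' % b)
        if b < 2:  # no transition after the last block
            x = transition(x, name='trans%d' % b)
    x = tf.reduce_mean(x, axis=[1, 2])                   # global average pooling
    x = tf.layers.dense(x, 256, activation=tf.nn.relu)   # FC layer 1
    return tf.layers.dense(x, n_joints * 2)              # FC layer 2: (x, y) per joint
```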
DenseNet_Kinect_train.py is the training code, including the model structure definition, data processing, and the training phase with GradientDescentOptimizer.
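And a rough sketch of the training phase with GradientDescentOptimizer, reusing the `densenet_regressor` sketch above; the plain L2 loss on joint coordinates, the learning rate, and the placeholder shapes are assumptions on my part:

```python
images = tf.placeholder(tf.float32, [None, 200, 200, 3])  # depth frames, 3 channels
labels = tf.placeholder(tf.float32, [None, 11 * 2])       # n = 11 joints, flattened (x, y)

preds = densenet_regressor(images, n_joints=11, training=True)
loss = tf.reduce_mean(tf.reduce_sum(tf.square(preds - labels), axis=1))  # L2 on coordinates

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)   # keep batch-norm statistics updated
with tf.control_dependencies(update_ops):
    train_op = tf.train.GradientDescentOptimizer(learning_rate=1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # per training step:
    # _, l = sess.run([train_op, loss],
    #                 feed_dict={images: batch_x, labels: batch_y})
```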
P.S. This code is coarse and raw; I did not pay much attention to running time or efficiency.
Input: a batch of 200×200 depth images with three channels (RGB images would work as well), each containing a single person's body.
Output: the coordinates of n joints (n = 11 in my code), an n×2 vector.
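To make the output shape concrete, the flat prediction can be reshaped back into per-joint pairs; this continues the hypothetical names (`sess`, `preds`, `images`) from the training sketch above:

```python
import numpy as np

batch_x = np.zeros((4, 200, 200, 3), dtype=np.float32)  # dummy batch of 4 frames
flat = sess.run(preds, feed_dict={images: batch_x})     # shape (4, 22)
joints = flat.reshape(-1, 11, 2)                        # shape (4, 11, 2): one (x, y) per joint
```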
These two datasets may help provide training data:
ITOP by Fei-Fei Li et al., https://www.albert.cm/projects/viewpoint_3d_pose/ ,
and NTU RGB+D by Jun Liu et al., http://rose1.ntu.edu.sg/datasets/actionrecognition.asp
Here are some results:
These images were acquired with a Kinect V2, normalized to 0~255, and the single depth channel copied into 3 channels.
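A possible preprocessing step for that, sketched with NumPy; the min-max normalization is my guess at what "normalized to 0~255" means here, and the repo may do it differently:

```python
import numpy as np

def depth_to_3ch(depth):
    """Normalize a raw Kinect V2 depth frame to 0~255 and copy it into 3 channels."""
    d = depth.astype(np.float32)
    span = float(d.max() - d.min())
    d = (d - d.min()) / max(span, 1e-6) * 255.0  # min-max normalization (assumed)
    return np.repeat(d[..., np.newaxis], 3, axis=-1).astype(np.uint8)  # (H, W, 3)
```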