Doubt about the training prototxt in this repo. Can anyone repeat the training? #130

Open
KleinYuan opened this issue May 26, 2017 · 3 comments


KleinYuan commented May 26, 2017

@bittnt
I tried to repeat the training process on PASCAL VOC 2012 (20 classes plus background).
Here is my documentation: https://github.com/KleinYuan/train-crfasrnn

And here's the prototxt I used for training, which is exactly the same as what you posted here.

I trained with the FCN-8s model and this Caffe version for around 200k iterations. The result was pretty bad (much worse than plain FCN-8s), as attached.

Then I thought I might have missed something, so I used this tool to extract information about the layers of your pre-trained caffemodel. The result rather surprised me: I could not find the MultiStageMeanfield type or the multi_stage_meanfield_param on the 57th layer. Were you training with a different architecture?

I am quite confused now. It would be great if you could give me some hints and potentially share the actual training prototxt.

Note:
At the beginning I realized that I was using a newer Caffe than this repo and therefore needed to add crop_param here, here, and here. After adding those, the demo script produced the expected images with your pre-trained model, so I think the training pipeline and test scripts are OK.

Attached below is the 57th layer from the Caffe -> JSON dump of your pre-trained model:

{
      "blobs_lr": [
        10000,
        10000,
        1000
      ],
      "blobs": [
        {
          "channels": 1,
          "width": 21,
          "num": 1,
          "data": [
            2.684922456741333,
            -0.006915332283824682,
            0.006461392156779766,
            0.014774330891668797,
            0.017609767615795135,
            "(436 elements more)"
          ],
          "height": 21
        },
        {
          "channels": 1,
          "width": 21,
          "num": 1,
          "data": [
            4.664008140563965,
            -0.007745872251689434,
            0.011939325369894505,
            0.013802499510347843,
            0.017575804144144058,
            "(436 elements more)"
          ],
          "height": 21
        },
        {
          "channels": 1,
          "width": 21,
          "num": 1,
          "data": [
            -0.7012288570404053,
            0.008244414813816547,
            -0.008532184176146984,
            -0.013231083750724792,
            -0.01634358800947666,
            "(436 elements more)"
          ],
          "height": 21
        }
      ],
      "top": [
        "upscore"
      ],
      "name": "inference1",
      "bottom": [
        "unary",
        "Q0",
        "data_data_rgb_0_split_2"
      ]
    }

Whereas, here is my 57th layer:

{
      "blobs": [
        {
          "shape": {
            "dim": [
              1,
              1,
              21,
              21
            ]
          },
          "data": [
            1.8576951026916504,
            0.08358041942119598,
            0.0802476778626442,
            0.06616068631410599,
            0.07730358093976974,
            "(436 elements more)"
          ]
        },
        {
          "shape": {
            "dim": [
              1,
              1,
              21,
              21
            ]
          },
          "data": [
            4.411709785461426,
            0.05498894676566124,
            0.10218091309070587,
            0.0351337306201458,
            0.04846448451280594,
            "(436 elements more)"
          ]
        },
        {
          "shape": {
            "dim": [
              1,
              1,
              21,
              21
            ]
          },
          "data": [
            -0.1661415994167328,
            -0.5631195306777954,
            -0.6274277567863464,
            -0.5293889045715332,
            -0.5286368131637573,
            "(436 elements more)"
          ]
        }
      ],
      "bottom": [
        "unary",
        "Q0",
        "data_data_rgb_0_split_2"
      ],
      "top": [
        "pred"
      ],
      "multi_stage_meanfield_param": {
        "num_iterations": 5,
        "compatibility_mode": 0,
        "theta_alpha": 59,
        "bilateral_filter_weights_str": "5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5",
        "threshold": 2,
        "theta_gamma": 3,
        "theta_beta": 3,
        "spatial_filter_weights_str": "3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3"
      },
      "param": [
        {
          "lr_mult": 10000
        },
        {
          "lr_mult": 10000
        },
        {
          "lr_mult": 1000
        }
      ],
      "phase": 0,
      "type": "MultiStageMeanfield",
      "name": "inference1"
    }
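
For reference, a dump like the two 57th-layer blocks above can be produced with a short protobuf script along these lines. This is a minimal sketch, not the exact tool referenced in the thread; it assumes the caffe_pb2 module generated from caffe.proto is importable, and the model filename and the layer index 57 are taken from this discussion and may differ for other snapshots.

```python
# Minimal sketch: dump one layer's name, type, and blob contents from a .caffemodel.
from caffe.proto import caffe_pb2

net = caffe_pb2.NetParameter()
with open('TVG_CRFRNN_COCO_VOC.caffemodel', 'rb') as f:  # path is an assumption
    net.ParseFromString(f.read())

layers = net.layer if len(net.layer) else net.layers      # new-style vs. legacy field
layer = layers[57]                                        # index taken from this thread
print('{} {}'.format(layer.name, layer.type))
for blob in layer.blobs:
    if blob.HasField('shape'):
        dims = list(blob.shape.dim)                        # new-style BlobShape
    else:
        dims = [blob.num, blob.channels, blob.height, blob.width]  # legacy fields
    print('{} {}'.format(dims, list(blob.data)[:5]))
```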
KleinYuan (Author) commented:

Some results:

  1. FCN-8s
    output_fcn_8s

  2. After 182k iterations
    output_182000

  3. After 212k iterations
    output_212000

  4. CRF-as-RNN pre-trained model
    output_official

bittnt (Collaborator) commented May 27, 2017

Hi, thanks for your contribution! Regarding the results:

  1. It looks like Caffe2 does not have the MultiStageMeanfield layer. You might be getting this layer through Python somehow, but given the results, I am not sure the weights were transferred successfully.

  2. Also, your learning rate might be too high; it looks like your loss increases during training. It would be good practice to check IoU / per-pixel accuracy every 2k-4k iterations (see the first sketch after this list).

  3. You might also consider first extracting the FCN-8s component from the CRF-as-RNN pretrained model and training it on its own to see whether the loss decreases at all (see the second sketch after this list).

  4. Our pretrained model was obtained by first fine-tuning the plain FCN-32s network (without the CRF-RNN part) on COCO data, then building an FCN-8s network with the learnt weights, and finally training the CRF-RNN network end-to-end using VOC 2012 training data only.

  5. The CRF-as-RNN CPU/GPU MultiStageMeanfield layer has been merged into a newer Caffe at https://github.com/torrvision/caffe/tree/crfrnn
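
On point 2, here is a minimal sketch (not code from this repo) of a per-pixel accuracy and mean-IoU check over predicted and ground-truth label maps, assuming NumPy integer label arrays with 255 as the ignore label:

```python
import numpy as np

def eval_segmentation(pred, gt, num_classes=21, ignore_label=255):
    """Per-pixel accuracy and mean IoU from integer label maps of equal shape."""
    mask = gt != ignore_label
    # Confusion matrix: rows = ground truth, columns = prediction.
    hist = np.bincount(num_classes * gt[mask].astype(int) + pred[mask].astype(int),
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    pixel_acc = np.diag(hist).sum() / float(hist.sum())
    iou = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist) + 1e-10)
    return pixel_acc, np.nanmean(iou)
```

And on point 3, one way to extract the FCN-8s part is to load the pretrained weights against a prototxt that stops before the MultiStageMeanfield layer; Caffe matches layers by name and ignores the rest. The file names below are assumptions:

```python
import caffe

# Prototxt containing only the FCN-8s layers (no MultiStageMeanfield layer).
net = caffe.Net('fcn8s_only.prototxt', 'TVG_CRFRNN_COCO_VOC.caffemodel', caffe.TEST)
net.save('fcn8s_extracted.caffemodel')
```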

damiVongola commented:

@bittnt Hey, I was wondering if you could let us know how many iterations it took you to get your results. I have read the paper a couple of times and I cannot see any mention of the number of iterations you used (if it is mentioned somewhere and I missed it, I'm very sorry). As a side note, I am currently training CRF-RNN on 6-channel satellite images, and it is quite the hassle. Is it recommended to first train an FCN-8s on this 6-channel data before plugging in the mean-field iteration layer? I copied the weights for the first 3 channels from the FCN-8s caffemodel, then randomized the weights for the last three channels. This new model is what I am using for training/fine-tuning. Are there any concerns, tips, or issues you can point out? Thanks!!
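
For illustration, here is a minimal pycaffe net-surgery sketch of the channel expansion described above (reuse the RGB filters, randomize the extra input channels). The prototxt/caffemodel file names and the conv1_1 layer name are assumptions, not taken from this repo:

```python
import numpy as np
import caffe

# Source: a standard 3-channel FCN-8s; target: the same architecture with a
# 6-channel data layer. All file names here are assumptions.
src = caffe.Net('fcn8s_3ch.prototxt', 'fcn-8s.caffemodel', caffe.TEST)
dst = caffe.Net('fcn8s_6ch.prototxt', caffe.TEST)

# Copy every parameter blob whose layer name and shape already match.
for name, blobs in src.params.items():
    if name not in dst.params:
        continue
    for i, blob in enumerate(blobs):
        if blob.data.shape == dst.params[name][i].data.shape:
            dst.params[name][i].data[...] = blob.data

# First conv layer: reuse the RGB filters, randomly initialize the extra channels.
w3 = src.params['conv1_1'][0].data              # e.g. (64, 3, 3, 3)
w6 = dst.params['conv1_1'][0].data              # e.g. (64, 6, 3, 3)
w6[:, :3] = w3
w6[:, 3:] = np.random.normal(0.0, w3.std(), w6[:, 3:].shape)
dst.params['conv1_1'][1].data[...] = src.params['conv1_1'][1].data  # bias

dst.save('fcn8s_6ch_init.caffemodel')
```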
