support pre-training or fine-tuning schemes #1
Comments
Thank you for your kind words! And sorry for only just seeing your issue. Sure, let me write a tutorial or build some example code for the pre-training. Basically, you just need to load any of the pre-trained weights we've provided and repeat the training process.
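In the meantime, here is a minimal sketch of that process, assuming a Detectron2-based training setup (as this repo uses); the config path, checkpoint path, and dataset name below are placeholders to swap for your own:

```python
# Minimal sketch: continue training from a provided pre-trained checkpoint.
# Assumes Detectron2; all paths and dataset names are placeholders.
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

# Register your own COCO-format dataset under a hypothetical name.
register_coco_instances(
    "my_dataset_train", {}, "annotations/train.json", "images/train"
)

cfg = get_cfg()
cfg.merge_from_file("configs/mask_rcnn_R_50_FPN_3x.yaml")  # placeholder config
cfg.MODEL.WEIGHTS = "model_final.pth"   # the released pre-trained checkpoint
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ()
cfg.SOLVER.MAX_ITER = 3000              # much shorter than from-scratch training

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)    # loads cfg.MODEL.WEIGHTS, resets iteration
trainer.train()
```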
Oh, that would be great – thanks in advance! I expect that just initializing with your pre-trained models and training on new data would quickly make the model forget your large and broad initial dataset, because the initial gradients will be large. Apart from freezing layers, I have also thought about reducing the learning rate or imposing restrictive gradient clipping. But I guess I'll have to run these experiments anyway...
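For what it's worth, both of those mitigations are plain config switches if Detectron2 is doing the training; the values below are illustrative guesses, not tuned settings:

```python
# Illustrative fine-tuning knobs (Detectron2 config); values are untuned guesses.
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.SOLVER.BASE_LR = 0.0001                   # far lower than a from-scratch schedule
cfg.SOLVER.WARMUP_ITERS = 200                 # slow warm-up softens the first updates
cfg.SOLVER.CLIP_GRADIENTS.ENABLED = True      # restrictive gradient clipping
cfg.SOLVER.CLIP_GRADIENTS.CLIP_TYPE = "norm"  # clip by total gradient norm ("value" also works)
cfg.SOLVER.CLIP_GRADIENTS.CLIP_VALUE = 1.0
cfg.SOLVER.CLIP_GRADIENTS.NORM_TYPE = 2.0
```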
Same question here! And congratulations on this amazing tool.
Would be great to have such a script/documentation! Thanks a lot!
Any updates on your experiment @bertsky? I intend to do something similar and would like to learn from your experience.
@lolipopshock I am trying to fine-tune this model on my own custom data, with different classes than the ones the model was trained on. Here is what I have done:
This led to the following warning:
Now here is what I think has happened:
Please correct me if I am wrong :)
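For anyone hitting the same situation: assuming Detectron2 underneath, a shape-mismatch warning is the expected behavior when the class count changes. The checkpointer skips the incompatible class-prediction weights and re-initializes them, while the rest of the network is loaded from the checkpoint. A minimal sketch (class count and paths are placeholders):

```python
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file("configs/mask_rcnn_R_50_FPN_3x.yaml")  # placeholder config
cfg.MODEL.WEIGHTS = "publaynet_model_final.pth"            # placeholder checkpoint
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4     # your own class count (PubLayNet itself has 5)
cfg.DATASETS.TRAIN = ("my_dataset_train",)  # assumes a registered dataset, as above

trainer = DefaultTrainer(cfg)
# Detectron2 typically logs "Skip loading parameter ... incompatible shapes" here:
# the final class-prediction layers are re-initialized, everything else is loaded.
trainer.resume_or_load(resume=False)
trainer.train()
```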
It would be really great to have a short tutorial on fine-tuning on a custom dataset with custom labels, starting from a pretrained model.
If I'm successful in fine-tuning on a custom dataset, I will definitely work towards making a tutorial for it.
The plan for updating the repo and creating a dedicated fine-tuning tutorial has been unintentionally delayed - I will get back to this project within one or two weeks and release the updates. Please stay tuned :)
Hi! Any updates on the fine-tuning tutorial? I'm looking forward to it!
We recently updated a bunch of things to make the repo more flexible; I'll work on creating the tutorial as and when I'm free, usually over the weekend.
Hi all, I will close this issue once the tutorial is published and add a link to the published version here.
It seems the post is still not publicly available?
For now, to access the draft you'll need to be logged in to your Medium account; once it's published (hopefully within three days at most), you'll be able to access it publicly without logging in.
Thanks - I just took a quick look, and it looks nice! Would you mind if I also included it on layout-parser's website as a tutorial for model training in the future? We can discuss the details if you join the Slack channel - thanks!
Would love that, I'll join the channel right away.
The tutorial is now live on Towards Data Science.
See #10 |
Thanks for sharing your work; it's awesome!
I am eager to train this on my own materials, but they are comparatively scarce, and I don't have the computational capacity to train on the whole of PubLayNet from scratch myself.
So I was wondering: what changes would be needed to continue training from your pre-trained models? Or, going further, do you think it would be worthwhile to load an existing model, freeze most of the weights, and add some additional layers to the FPN for fine-tuning?
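For concreteness, freezing most of the weights could look roughly like this sketch, assuming a Detectron2 model; `FREEZE_AT` controls how many backbone stages stay fixed, and the name-prefix loop (based on `GeneralizedRCNN`'s parameter naming) is an assumption about the model layout, not an official API:

```python
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file("configs/mask_rcnn_R_50_FPN_3x.yaml")  # placeholder config
cfg.MODEL.WEIGHTS = "model_final.pth"       # placeholder pre-trained checkpoint
cfg.MODEL.BACKBONE.FREEZE_AT = 5            # freeze all ResNet stages of the backbone
cfg.DATASETS.TRAIN = ("my_dataset_train",)  # assumes a registered dataset

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)

# Additionally freeze everything outside the ROI heads. (Cleaner would be to
# freeze before the optimizer is built, but PyTorch optimizers skip parameters
# whose gradients stay None, so this still works for a quick experiment.)
for name, param in trainer.model.named_parameters():
    if not name.startswith("roi_heads"):
        param.requires_grad = False

trainer.train()
```

Adding new layers on top of the FPN would require subclassing the model architecture, so freezing plus a low learning rate is probably the cheaper experiment to try first.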