
Which pre-trained model should we use for fine-tuning? #36

Open
Aniruddha-JU opened this issue Aug 25, 2022 · 2 comments

Comments

@Aniruddha-JU

I have pre-trained the IndicBART model on new monolingual data, and two models are saved in the model path: 1) IndicBART and 2) IndicBART_puremodel. Which one should we use during fine-tuning?

@Aniruddha-JU
Author

The IndicBART checkpoint is 2.4 GB and the pure_model checkpoint is 932 MB.

@prajdabre
Owner

Either.

Use the pure model with the flag --pretrained_model.

Use the larger model with the flag --pretrained_model and the additional flag --no_reload_optimizer_ctr_and_scheduler.

The larger checkpoint contains the optimizer and scheduler states so that you can resume pretraining in case of a crash. During fine-tuning, resetting the optimizer is more common.
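
If you want to verify this yourself, loading both files and printing their top-level structure makes the difference visible. The sketch below is not this repo's own loading code; it assumes the common PyTorch convention of saving the full checkpoint as a dict of state dicts and the pure model as bare weights, and the key names shown are guesses based on the flag name:

```python
# Minimal sketch (not this repo's code) to inspect the two checkpoints.
# Assumes the usual PyTorch convention: the full checkpoint is a dict of
# state dicts, the pure model is just the weights. Key names are guesses.
import torch

full = torch.load("IndicBART", map_location="cpu")
pure = torch.load("IndicBART_puremodel", map_location="cpu")

# For the full checkpoint, expect something like
# ['model', 'optimizer', 'scheduler', 'ctr'] (names assumed).
print(list(full.keys()) if isinstance(full, dict) else type(full))

# The pure model should be only the weights (a state_dict of tensors).
print(type(pure))

# Size sanity check: Adam keeps two extra tensors (exp_avg, exp_avg_sq)
# per parameter, so weights + optimizer state can approach 3x the weights
# alone, which is roughly in line with 932 MB vs 2.4 GB here.
```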
