Warper and HPE training hyperparams #9

ifeherva · 2022-12-27T19:00:01Z

I noticed that, according to the paper, the learning rate for both the warper and the HPE was fixed during training (100k/50k iterations). With these params, I see the loss stagnating after a few thousand iterations.

Have you tried LR decay or different params? Why did you use such a high number of iterations? Did you see a steady decrease in loss over 50k iterations?
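
To make the question concrete, this is the kind of LR decay I have in mind: a minimal step-decay sketch in plain Python. The function name `step_decay_lr` and the values used (base LR of 1e-4, halving every 10k iterations) are hypothetical, chosen for illustration only. They are not taken from the paper or from this repo.

```python
def step_decay_lr(base_lr: float, iteration: int,
                  decay_every: int = 10_000, gamma: float = 0.5) -> float:
    """Return the learning rate at `iteration` under step decay.

    The rate is multiplied by `gamma` once every `decay_every` iterations.
    All default values are illustrative assumptions, not the paper's settings.
    """
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    return base_lr * gamma ** (iteration // decay_every)


if __name__ == "__main__":
    # Hypothetical schedule over a 50k-iteration run.
    for it in (0, 10_000, 25_000, 49_999):
        print(f"iter {it:>6}: lr = {step_decay_lr(1e-4, it):.3e}")
```

A schedule like this keeps the early iterations at the full rate and only lowers it later, which is one way to check whether the plateau comes from the fixed rate.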
