Hi Andrej,
I recently ran the trainer demo on MNIST and wondered why the Adam optimizer performs so much worse than Adadelta.
I think I found a little bug in the Adam implementation.
According to Algorithm 1 (p. 2) of the Adam paper, v8 (https://arxiv.org/pdf/1412.6980v8.pdf), the bias-corrected moment estimates are computed by dividing by (1 - beta^t), not multiplying by it. The fixed version behaves significantly better when running the trainer demo on MNIST. To get the results below I also changed the learning rate to 0.001 and the beta2 parameter to 0.999 (from 0.01 and 0.99 respectively), as recommended in the paper.
Before:
After:
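For clarity, here is a minimal sketch of a single Adam update with the bias correction applied by division, as in Algorithm 1 of the paper. This is only an illustration of the fix; the function and parameter names are hypothetical and not the actual ConvNetJS code:

```typescript
// Illustrative Adam step (not the ConvNetJS source). The bias-corrected
// estimates divide by (1 - beta^t); the buggy version multiplied instead.
function adamStep(
  w: Float64Array,   // parameters, updated in place
  g: Float64Array,   // gradients dL/dw
  m: Float64Array,   // first-moment estimate (running mean of g)
  v: Float64Array,   // second-moment estimate (running mean of g^2)
  t: number,         // timestep, starting at 1
  lr = 0.001,
  beta1 = 0.9,
  beta2 = 0.999,
  eps = 1e-8
): void {
  for (let i = 0; i < w.length; i++) {
    // update biased moment estimates
    m[i] = beta1 * m[i] + (1 - beta1) * g[i];
    v[i] = beta2 * v[i] + (1 - beta2) * g[i] * g[i];
    // bias correction: divide, not multiply
    const mHat = m[i] / (1 - Math.pow(beta1, t));
    const vHat = v[i] / (1 - Math.pow(beta2, t));
    // parameter update
    w[i] -= (lr * mHat) / (Math.sqrt(vHat) + eps);
  }
}
```

With multiplication, the correction shrinks the step sizes early in training instead of enlarging them, which would explain the poor performance relative to Adadelta.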