In the last post I was wondering why momentum descent methods aren't widely used for neural network training. And guess what, a quick search showed that they are, and so are the auto-scaling methods.
In particular, here is an interesting paper:
https://arxiv.org/pdf/1412.6980.pdf
Adam: A Method for Stochastic Optimization
What they do actually looks more like a momentum method than an auto-selection of the descent rate (I've sketched the update further below). The paper also has references to a few other methods:
- AdaGrad
- RMSProp
- vSGD
- AdaDelta
I originally found this algorithm through another web site that talks about various ML methods:
https://machinelearningmastery.com/adam-optimization-from-scratch/
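To see why it reads like a momentum method, here is a minimal sketch of the Adam update as described in the paper: the first moment m is a momentum-like running average of the gradient, and the second moment v provides the per-parameter scaling of the step. The gradient function grad_f, the starting point x0, and the quadratic test function are just my own assumptions for illustration, not anything from the paper.

```python
import numpy as np

def adam(grad_f, x0, steps=1000, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)  # first moment: momentum-like running average of the gradient
    v = np.zeros_like(x)  # second moment: running average of the squared gradient
    for t in range(1, steps + 1):
        g = grad_f(x)
        m = beta1 * m + (1 - beta1) * g            # update biased first moment estimate
        v = beta2 * v + (1 - beta2) * (g * g)      # update biased second moment estimate
        m_hat = m / (1 - beta1 ** t)               # bias correction for the first moment
        v_hat = v / (1 - beta2 ** t)               # bias correction for the second moment
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)   # per-parameter scaled step
    return x

# Example call on a simple quadratic f(x) = x^2 (gradient 2x), just to show the shape of the API.
print(adam(lambda x: 2 * x, x0=[5.0, -3.0]))
```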