# AdaMax
A version of the Adam optimizer that replaces the RMS property with the infinity norm of the past gradients. As such, AdaMax is generally more suitable for sparse parameter updates and noisy gradients.
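The update rule from the reference below can be sketched as follows: AdaMax keeps an exponentially decaying average of past gradients (the velocity) together with an exponentially weighted infinity norm of the gradients, and scales the step by their ratio. Here $\beta_1$ and $\beta_2$ are assumed to correspond to $1 - \text{momentumDecay}$ and $1 - \text{normDecay}$ respectively.

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
u_t &= \max\!\left(\beta_2\, u_{t-1},\, \lvert g_t \rvert\right) \\
\theta_t &= \theta_{t-1} - \frac{\alpha}{1 - \beta_1^t}\, \frac{m_t}{u_t}
\end{aligned}
$$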
## Parameters
| # | Name | Default | Type | Description |
|---|------|---------|------|-------------|
| 1 | rate | 0.001 | float | The learning rate that controls the global step size. |
| 2 | momentumDecay | 0.1 | float | The decay rate of the accumulated velocity. |
| 3 | normDecay | 0.001 | float | The decay rate of the infinity norm of the gradients. |
## Example
```php
use Rubix\ML\NeuralNet\Optimizers\AdaMax;

$optimizer = new AdaMax(0.0001, 0.1, 0.001);
```
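As a usage sketch, the optimizer is passed to a gradient descent-based learner at construction. The classes below exist in the library, but the exact constructor argument order (hidden layers, batch size, optimizer) is assumed here:

```php
use Rubix\ML\Classifiers\MultilayerPerceptron;
use Rubix\ML\NeuralNet\Layers\Dense;
use Rubix\ML\NeuralNet\Layers\Activation;
use Rubix\ML\NeuralNet\ActivationFunctions\ReLU;
use Rubix\ML\NeuralNet\Optimizers\AdaMax;

// Hidden layers, batch size, then the optimizer (argument order assumed).
$estimator = new MultilayerPerceptron([
    new Dense(100),
    new Activation(new ReLU()),
], 128, new AdaMax(0.0001));
```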
## References
- D. P. Kingma et al. (2014). Adam: A Method for Stochastic Optimization.