
# AdaMax

A version of the Adam optimizer that replaces the root mean square (RMS) of past gradients with their infinity norm, i.e. a decayed running maximum of the gradients' absolute values. As such, AdaMax is generally more suitable for sparse parameter updates and noisy gradients.

## Parameters

| # | Param | Default | Type | Description |
|---|-------|---------|------|-------------|
| 1 | rate | 0.001 | float | The learning rate that controls the global step size. |
| 2 | momentum decay | 0.1 | float | The decay rate of the accumulated velocity. |
| 3 | norm decay | 0.001 | float | The decay rate of the infinity norm. |
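To make the roles of the parameters concrete, here is a minimal NumPy sketch of one AdaMax update following Algorithm 2 of Kingma et al. (2014). This is an illustration only, not Rubix ML's PHP implementation, and it assumes the decay parameters map to the paper's coefficients as `beta1 = 1 - momentum decay` and `beta2 = 1 - norm decay`:

```python
import numpy as np

def adamax_step(param, grad, m, u, t, rate=0.001,
                momentum_decay=0.1, norm_decay=0.001):
    """One AdaMax update (Kingma et al., 2014, Algorithm 2).

    Assumes beta1 = 1 - momentum_decay and beta2 = 1 - norm_decay,
    which is an interpretation of the parameter table, not a fact
    taken from the Rubix ML source.
    """
    beta1 = 1.0 - momentum_decay
    beta2 = 1.0 - norm_decay
    # Exponentially decaying estimate of the first moment (velocity).
    m = beta1 * m + (1.0 - beta1) * grad
    # Infinity-norm accumulator: max of the decayed previous norm
    # and the magnitude of the current gradient.
    u = np.maximum(beta2 * u, np.abs(grad))
    # Bias-corrected step; unlike Adam's second moment, u needs
    # no bias correction.
    step = (rate / (1.0 - beta1 ** t)) * m / u
    return param - step, m, u
```

Because `u` tracks a maximum rather than a mean of squares, the effective per-parameter step size is bounded by the learning rate, which tends to behave well when gradients are sparse or noisy.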

## Example

```php
use Rubix\ML\NeuralNet\Optimizers\AdaMax;

$optimizer = new AdaMax(0.0001, 0.1, 0.001);
```

## References

- D. P. Kingma et al. (2014). Adam: A Method for Stochastic Optimization.