# RMS Prop
An adaptive gradient technique that divides the current gradient by a running average of the magnitudes of recent gradients. Unlike AdaGrad, RMS Prop does not suffer from an infinitely decaying step size.
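For intuition, below is a minimal sketch of a single RMS Prop update for one scalar parameter. It assumes the common formulation in which the decay hyperparameter controls how much of the new squared gradient is blended into the running average and a small epsilon term prevents division by zero; it is illustrative only and does not reflect the library's internal code.

```php
// Illustrative sketch only, not the library's implementation: one RMSProp
// update for a single scalar parameter, assuming the common formulation
// with an epsilon term added for numerical stability.
$rate = 0.001;    // global step size
$decay = 0.1;     // fraction of the new squared gradient mixed into the cache
$epsilon = 1e-8;  // guards against division by zero

$cache = 0.0;     // running average of squared gradient magnitudes
$param = 0.5;     // the parameter being optimized
$gradient = 0.2;  // gradient of the loss with respect to the parameter

// Blend the squared gradient into the rolling average.
$cache = (1.0 - $decay) * $cache + $decay * $gradient ** 2;

// Divide the step by the root mean square of recent gradients.
$param -= $rate * $gradient / (sqrt($cache) + $epsilon);
```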
## Parameters
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | rate | 0.001 | float | The learning rate that controls the global step size. |
| 2 | decay | 0.1 | float | The decay rate of the RMS property. |
## Example
```php
use Rubix\ML\NeuralNet\Optimizers\RMSProp;

$optimizer = new RMSProp(0.01, 0.1);
```
## References
- T. Tieleman et al. (2012). Lecture 6e rmsprop: Divide the gradient by a running average of its recent magnitude.