
# Gradient Boost

Gradient Boost (GBM) is a stage-wise additive ensemble that uses a Gradient Descent boosting scheme to train boosters (Decision Trees) that correct the error residuals of a weak base learner. Stochastic gradient boosting is achieved by subsampling a ratio of the training set uniformly at random to train each booster. GBM also employs progress monitoring via an internal validation set for snapshotting and early stopping.
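
To make the additive scheme concrete, here is a minimal sketch of how a trained ensemble forms a prediction, using a hypothetical `ensemblePredict()` helper rather than anything from the library: the output is the base learner's prediction plus the sum of each booster's correction, shrunk by the learning rate.

```php
// Illustrative sketch only, not the library's internals: the ensemble
// prediction is the base learner's output plus each booster's correction,
// scaled by the learning rate (shrinkage).
function ensemblePredict(float $basePrediction, array $corrections, float $rate) : float
{
    $prediction = $basePrediction;

    foreach ($corrections as $correction) {
        $prediction += $rate * $correction;
    }

    return $prediction;
}

echo ensemblePredict(10.0, [2.5, 1.2, 0.4], 0.1); // 10.41
```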

Note: The default base regressor is a Dummy Regressor using the Mean strategy and the default booster is a Regression Tree with a max height of 3.
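
Since every parameter has a default, the estimator can also be instantiated with no arguments at all:

```php
use Rubix\ML\Regressors\GradientBoost;

// Relies entirely on the defaults described above.
$estimator = new GradientBoost();
```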

Note: If there are not enough training samples to build an internal validation set with the user-specified hold out ratio, then progress monitoring will be disabled.

**Interfaces:** Estimator, Learner, Verbose, Ranks Features, Persistable

**Data Type Compatibility:** Depends on base learners

## Parameters

| # | Param | Default | Type | Description |
|---|---|---|---|---|
| 1 | booster | RegressionTree | Learner | The regressor that will fix up the error residuals of the weak base learner. |
| 2 | rate | 0.1 | float | The learning rate of the ensemble i.e. the shrinkage applied to each step. |
| 3 | ratio | 0.5 | float | The ratio of samples to subsample from the training set to train each booster. |
| 4 | estimators | 1000 | int | The maximum number of boosters to train in the ensemble. |
| 5 | min change | 1e-4 | float | The minimum change in the training loss necessary to continue training. |
| 6 | window | 10 | int | The number of epochs without improvement in the validation score to wait before considering an early stop. |
| 7 | hold out | 0.1 | float | The proportion of training samples to use for progress monitoring. |
| 8 | metric | RMSE | Metric | The metric used to score the generalization performance of the model during training. |
| 9 | base | DummyRegressor | Learner | The weak base learner to be boosted. |

## Example

```php
use Rubix\ML\Regressors\GradientBoost;
use Rubix\ML\Regressors\RegressionTree;
use Rubix\ML\CrossValidation\Metrics\SMAPE;
use Rubix\ML\Regressors\DummyRegressor;
use Rubix\ML\Other\Strategies\Constant;

$estimator = new GradientBoost(new RegressionTree(3), 0.1, 0.8, 1000, 1e-4, 10, 0.1, new SMAPE(), new DummyRegressor(new Constant(0.0)));
```
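
As with other Rubix ML learners, the estimator is trained on a labeled dataset and can then make predictions. A short sketch, assuming hypothetical `$samples` and `$labels` arrays of continuous features and labels:

```php
use Rubix\ML\Datasets\Labeled;

// $samples and $labels are placeholders for your own training data.
$dataset = new Labeled($samples, $labels);

$estimator->train($dataset);

$predictions = $estimator->predict($dataset);
```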

## Additional Methods

Return the validation score at each epoch from the last training session:

```php
public scores() : float[]|null
```

Return the loss at each epoch from the last training session:

```php
public steps() : float[]|null
```
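
For example, both series can be retrieved after training to inspect convergence:

```php
// Validation scores and training losses from the last session,
// or null if the estimator has not been trained yet.
$scores = $estimator->scores();

$losses = $estimator->steps();
```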

## References

- J. H. Friedman. (2001). Greedy Function Approximation: A Gradient Boosting Machine.
- J. H. Friedman. (1999). Stochastic Gradient Boosting.
- Y. Wei et al. (2017). Early stopping for kernel boosting algorithms: A general analysis with localized complexities.