Regression Tree#

A decision tree based on the CART (Classification and Regression Trees) learning algorithm. The tree is grown greedily: at each node, it chooses the split that minimizes the variance of the labels in the resulting child nodes.
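The variance-minimizing split can be sketched in plain PHP for a single continuous feature. This is an illustration of the idea only, not the library's implementation; the function names `variance()` and `bestSplit()` are hypothetical, and the sample-weighted average of the child variances is assumed as the impurity measure.

```php
<?php

declare(strict_types=1);

/**
 * Population variance of a list of numeric labels (0.0 for an empty list).
 */
function variance(array $labels): float
{
    $n = count($labels);

    if ($n === 0) {
        return 0.0;
    }

    $mean = array_sum($labels) / $n;

    $ssd = 0.0;

    foreach ($labels as $label) {
        $ssd += ($label - $mean) ** 2;
    }

    return $ssd / $n;
}

/**
 * Try every distinct feature value as a threshold and keep the one that
 * gives the lowest sample-weighted variance of the labels in the two children.
 * (Hypothetical helper, assumed for illustration.)
 */
function bestSplit(array $values, array $labels): array
{
    $n = count($values);

    $best = ['threshold' => null, 'impurity' => INF];

    foreach (array_unique($values) as $threshold) {
        $left = $right = [];

        foreach ($values as $i => $value) {
            if ($value < $threshold) {
                $left[] = $labels[$i];
            } else {
                $right[] = $labels[$i];
            }
        }

        if (empty($left) || empty($right)) {
            continue; // One side is empty, so this is not a real split.
        }

        $impurity = (count($left) * variance($left) + count($right) * variance($right)) / $n;

        if ($impurity < $best['impurity']) {
            $best = ['threshold' => (float) $threshold, 'impurity' => $impurity];
        }
    }

    return $best;
}

// The labels jump between feature values 3.0 and 10.0, so the chosen threshold is 10.0.
$split = bestSplit(
    [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
    [5.0, 6.0, 5.5, 20.0, 21.0, 19.5]
);
```

A full tree would repeat this search over every candidate feature at every node and recurse into the two children.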

Interfaces: Estimator, Learner, Persistable

Data Type Compatibility: Categorical, Continuous

Parameters#

| # | Param | Default | Type | Description |
|---|---|---|---|---|
| 1 | max depth | PHP_INT_MAX | int | The maximum depth of a branch in the tree. |
| 2 | max leaf size | 3 | int | The maximum number of samples that a leaf node can contain. |
| 3 | max features | Auto | int | The maximum number of features to consider when determining a best split. |
| 4 | min purity increase | 1e-7 | float | The minimum increase in purity necessary for a node not to be post pruned. |
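One plausible reading of how these parameters constrain the tree is sketched below. The semantics are assumed from the descriptions above rather than taken from the library's source, and `shouldTerminate()` and `shouldPrune()` are hypothetical helpers.

```php
<?php

declare(strict_types=1);

/**
 * Assumed stopping rule: a node becomes a leaf once the branch has reached
 * the maximum depth or the node holds no more samples than a leaf may contain.
 */
function shouldTerminate(int $depth, int $numSamples, int $maxDepth, int $maxLeafSize): bool
{
    return $depth >= $maxDepth || $numSamples <= $maxLeafSize;
}

/**
 * Assumed pruning rule: a split is pruned when the impurity it removes
 * (parent impurity minus weighted child impurity) is below the minimum.
 */
function shouldPrune(float $parentImpurity, float $childImpurity, float $minPurityIncrease): bool
{
    return ($parentImpurity - $childImpurity) < $minPurityIncrease;
}

// With the defaults (unbounded depth, max leaf size 3), 100 samples at depth 3 keep splitting.
$keepsSplitting = !shouldTerminate(3, 100, PHP_INT_MAX, 3);
```

Under this reading, lowering max depth or raising max leaf size and min purity increase produces a smaller, more regularized tree.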

Additional Methods#

Return the normalized feature importances i.e. the proportion that each feature contributes to the overall model, indexed by feature column:

public featureImportances() : array
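As an illustration of what "normalized" means here, the sketch below scales a set of raw per-feature scores so that they sum to 1 while keeping the column indices. The raw scores (assumed to be the total impurity decrease attributed to each column) and the helper `normalizeImportances()` are hypothetical and not part of the library.

```php
<?php

declare(strict_types=1);

/**
 * Scale raw per-feature scores so they sum to 1, preserving column indices.
 * Returns all zeros when there is nothing to distribute.
 */
function normalizeImportances(array $raw): array
{
    $total = array_sum($raw);

    if ($total <= 0.0) {
        return array_fill_keys(array_keys($raw), 0.0);
    }

    // array_map() keeps the keys when it is given a single array.
    return array_map(fn (float $score): float => $score / $total, $raw);
}

// Hypothetical raw scores for feature columns 0, 1 and 2.
$importances = normalizeImportances([0 => 6.0, 1 => 3.0, 2 => 1.0]);

// Sort from most to least important, keeping the column indices as keys.
arsort($importances);
```

Because the values are proportions, they can be compared directly across features, and sorting with `arsort()` gives a ranking by column index.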

Display a human readable text representation of the decision tree:

public printRules() : void

Return the height of the tree:

public height() : int

Return the balance factor of the tree:

public balance() : int
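The two tree statistics above can be illustrated on a toy binary tree. The sketch assumes the common definitions (height as the number of nodes on the longest path from the root to a leaf, balance factor as the left subtree height minus the right subtree height); the library may count differently, and `TreeNode` is a hypothetical class that requires PHP 8.0 or later.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical binary tree node used only for this illustration.
 */
final class TreeNode
{
    public function __construct(
        public ?TreeNode $left = null,
        public ?TreeNode $right = null
    ) {
    }
}

/** Number of nodes on the longest path from this node down to a leaf. */
function height(?TreeNode $node): int
{
    if ($node === null) {
        return 0;
    }

    return 1 + max(height($node->left), height($node->right));
}

/** Height of the left subtree minus the height of the right subtree. */
function balance(?TreeNode $node): int
{
    if ($node === null) {
        return 0;
    }

    return height($node->left) - height($node->right);
}

// The left child has one further child; the right child is a leaf.
$root = new TreeNode(new TreeNode(new TreeNode()), new TreeNode());

$h = height($root);  // 3
$b = balance($root); // 1
```

Under these definitions, a balance factor far from zero indicates a lopsided tree in which one branch is much deeper than the other.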

Example#

use Rubix\ML\Regressors\RegressionTree;

// Max depth 20, max leaf size 2, automatic max features (null), min purity increase 1e-3.
$estimator = new RegressionTree(20, 2, null, 1e-3);
