Classification Tree
A binary tree-based learner that greedily constructs a decision map for classification by minimizing the Gini impurity of the training labels within the leaf nodes. The height and bushiness of the tree are controlled by the user-defined max height (maxHeight) and max leaf size (maxLeafSize) hyper-parameters. Classification Trees also serve as the base learner of ensemble methods such as Random Forest and AdaBoost.
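To make the splitting criterion concrete, here is a minimal sketch of the Gini impurity of a set of labels, computed as 1 minus the sum of the squared class proportions. The function name `giniImpurity` is hypothetical and not part of the Rubix ML API, and the library's internal implementation may differ.

```php
<?php

/**
 * Gini impurity of a list of class labels: 1 - sum over classes of p_k^2.
 * Illustrative helper only; not part of the Rubix ML API.
 *
 * @param list<string|int> $labels
 */
function giniImpurity(array $labels): float
{
    $n = count($labels);

    // An empty node is treated as pure in this sketch.
    if ($n === 0) {
        return 0.0;
    }

    $sumOfSquares = 0.0;

    // array_count_values() tallies how often each label occurs.
    foreach (array_count_values($labels) as $count) {
        $sumOfSquares += ($count / $n) ** 2;
    }

    return 1.0 - $sumOfSquares;
}

var_dump(giniImpurity(['cat', 'cat', 'cat']));        // pure node
var_dump(giniImpurity(['cat', 'dog', 'cat', 'dog'])); // evenly mixed node
```

A node containing a single class has an impurity of 0, while an even mix of two classes has an impurity of 0.5, so lower values indicate purer leaves.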
Interfaces: Estimator, Learner, Probabilistic, Ranks Features, Persistable
Data Type Compatibility: Categorical, Continuous
Parameters
# | Name | Default | Type | Description |
---|---|---|---|---|
1 | maxHeight | PHP_INT_MAX | int | The maximum height of the tree. |
2 | maxLeafSize | 3 | int | The max number of samples that a leaf node can contain. |
3 | minPurityIncrease | 1e-7 | float | The minimum increase in purity necessary to continue splitting a subtree. |
4 | maxFeatures | Auto | int | The max number of feature columns to consider when determining a best split. |
5 | maxBins | Auto | int | The maximum number of bins to consider when determining a split on a continuous feature. |
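The way maxLeafSize and minPurityIncrease gate a split can be illustrated with a simplified greedy search over a single continuous feature. This is a sketch under stated assumptions, not the library's implementation: the helper names `nodeImpurity` and `bestSplit` are hypothetical, every distinct value is tried as a threshold (there is no binning as with maxBins), and the purity increase is taken to be the parent's impurity minus the size-weighted impurity of the two children.

```php
<?php

/** Gini impurity of a list of labels (illustrative helper, not Rubix ML API). */
function nodeImpurity(array $labels): float
{
    $n = count($labels);

    if ($n === 0) {
        return 0.0;
    }

    $sum = 0.0;

    foreach (array_count_values($labels) as $count) {
        $sum += ($count / $n) ** 2;
    }

    return 1.0 - $sum;
}

/**
 * Return the threshold with the largest purity increase, or null when the
 * node should become a leaf instead of being split.
 *
 * @param list<float> $values one continuous feature column
 * @param list<string|int> $labels class label of each sample
 * @return array{threshold: float, increase: float}|null
 */
function bestSplit(
    array $values,
    array $labels,
    int $maxLeafSize = 3,
    float $minPurityIncrease = 1e-7
): ?array {
    $n = count($values);

    // A node that already fits within the leaf size limit is not split.
    if ($n <= $maxLeafSize) {
        return null;
    }

    $parent = nodeImpurity($labels);
    $best = null;

    foreach (array_unique($values) as $threshold) {
        $left = $right = [];

        // Samples below the threshold go left, the rest go right.
        foreach ($values as $i => $value) {
            if ($value < $threshold) {
                $left[] = $labels[$i];
            } else {
                $right[] = $labels[$i];
            }
        }

        // Skip thresholds that leave one side empty.
        if (!$left || !$right) {
            continue;
        }

        $weighted = (count($left) * nodeImpurity($left)
            + count($right) * nodeImpurity($right)) / $n;

        $increase = $parent - $weighted;

        if ($best === null || $increase > $best['increase']) {
            $best = ['threshold' => (float) $threshold, 'increase' => $increase];
        }
    }

    // Stop splitting when even the best candidate gains too little purity.
    if ($best === null || $best['increase'] < $minPurityIncrease) {
        return null;
    }

    return $best;
}

var_dump(bestSplit(
    [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
    ['a', 'a', 'a', 'b', 'b', 'b']
));
```

In this sketch, raising maxLeafSize or minPurityIncrease makes `bestSplit` return null sooner, which corresponds to a shallower, less bushy tree.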
Example
use Rubix\ML\Classifiers\ClassificationTree;
$estimator = new ClassificationTree(10, 5, 0.001, null, null);
Additional Methods
Return the number of levels in the tree.
public height() : ?int
Return a factor that quantifies the skewness of the distribution of nodes in the tree.
public balance() : ?int
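As a rough usage sketch, the two methods can be called once the tree has been trained. The example assumes Rubix ML was installed with Composer so that `vendor/autoload.php` exists next to the script; the toy samples and labels are invented for illustration. The nullable return types suggest that both methods return null before training.

```php
<?php

// Assumption: Rubix ML is installed via Composer alongside this script.
require __DIR__ . '/vendor/autoload.php';

use Rubix\ML\Classifiers\ClassificationTree;
use Rubix\ML\Datasets\Labeled;

// Toy data invented for illustration: [height in cm, weight in kg].
$samples = [
    [150.0, 45.0], [155.0, 50.0], [160.0, 52.0],
    [180.0, 85.0], [185.0, 90.0], [190.0, 95.0],
];

$labels = ['small', 'small', 'small', 'large', 'large', 'large'];

// maxHeight of 10 and maxLeafSize of 2; remaining parameters keep their defaults.
$estimator = new ClassificationTree(10, 2);

// Expected to be null before training, per the ?int return type.
var_dump($estimator->height());

$estimator->train(new Labeled($samples, $labels));

echo 'Height: ', $estimator->height(), PHP_EOL;
echo 'Balance: ', $estimator->balance(), PHP_EOL;
```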
Last update: 2021-10-31