[source]

Isolation Forest#

An ensemble of Isolation Trees all of which specialize on a unique subset of the training set. Isolation Trees are a type of randomized decision tree that assign anomaly scores based on the depth a sample reaches when traversing the tree. Anomalies are isolated into the shallowest leaf nodes and as such receive the highest isolation scores.

Interfaces: Estimator, Learner, Ranking, Persistable

Data Type Compatibility: Categorical, Continuous

Parameters#

# Param Default Type Description
1 estimators 100 int The number of isolation trees to train in the ensemble.
2 ratio null float The ratio of samples to train each estimator with. If null, the subsample size will be set to 256.
3 contamination null float The proportion of outliers that are assumed to be present in the training set. If null, the threshold anomaly score will be set to 0.5.

Example#

use Rubix\ML\AnomalyDetectors\IsolationForest;

$estimator = new IsolationForest(100, 0.2, 0.05);

Additional Methods#

This estimator does not have any additional methods.

References#

  • F. T. Liu et al. (2008). Isolation Forest.
  • F. T. Liu et al. (2011). Isolation-based Anomaly Detection.
  • M. Garchery et al. (2018). On the influence of categorical features in ranking anomalies using mixed data.