
Grid Search#

Grid Search is an algorithm that optimizes hyper-parameter selection. From the user's perspective, the process of training and predicting is the same as with any other estimator; under the hood, however, Grid Search trains one estimator per combination of hyper-parameters and selects the best-performing model as the base estimator.

Note: You can choose the hyper-parameter values manually, or generate them randomly or in a grid using the Params helper.
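
For instance, a minimal sketch of building a grid with the Params helper is shown below. It assumes the helper lives at Rubix\ML\Other\Helpers\Params and exposes a static ints() generator that draws unique random integers; the exact namespace and method names may differ between library versions.

use Rubix\ML\Other\Helpers\Params;
use Rubix\ML\Kernels\Distance\Euclidean;
use Rubix\ML\Kernels\Distance\Manhattan;

// 4 unique random integers between 1 and 20 for k, both weighting options,
// and two candidate distance kernels.
$grid = [
    Params::ints(1, 20, 4),
    [true, false],
    [new Euclidean(), new Manhattan()],
];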

Interfaces: Estimator, Learner, Parallel, Persistable, Verbose

Data Type Compatibility: Depends on base learner

Parameters#

| # | Param | Default | Type | Description |
|---|-----------|---------|--------|-------------|
| 1 | base | | string | The fully qualified class name of the base Estimator. |
| 2 | grid | | array | An array of tuples where each tuple contains the possible values for a parameter, in the order the parameters are given to the base learner's constructor. |
| 3 | metric | Auto | object | The validation metric used to score each set of hyper-parameters. |
| 4 | validator | KFold | object | An instance of a validator object (HoldOut, KFold, etc.) that will be used to test each model. |

Note: The default validation metrics are F Beta for classifiers and anomaly detectors, R Squared for regressors, and V Measure for clusterers.
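
As a sketch of overriding these defaults, the constructor call below scores each candidate with an Accuracy metric on a single hold out set. It assumes the HoldOut validator's constructor takes the hold out ratio, and that a hyper-parameter $grid like the one in the example further down has already been defined.

use Rubix\ML\GridSearch;
use Rubix\ML\Classifiers\KNearestNeighbors;
use Rubix\ML\CrossValidation\Metrics\Accuracy;
use Rubix\ML\CrossValidation\HoldOut;

// Validate each candidate model on a 20% hold out set scored by Accuracy.
$estimator = new GridSearch(KNearestNeighbors::class, $grid, new Accuracy(), new HoldOut(0.2));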

Additional Methods#

Return an array of every possible combination of hyper-parameters:

public combinations() : array

Return an n-tuple containing the best parameters based on their validation score:

public best() : array

Return the underlying base estimator:

public estimator() : Estimator

Example#

use Rubix\ML\GridSearch;
use Rubix\ML\Classifiers\KNearestNeighbors;
use Rubix\ML\Kernels\Distance\Euclidean;
use Rubix\ML\Kernels\Distance\Manhattan;
use Rubix\ML\CrossValidation\Metrics\FBeta;
use Rubix\ML\CrossValidation\KFold;

$grid = [
    [1, 3, 5, 10], // k
    [true, false], // weighted?
    [new Euclidean(), new Manhattan()], // distance kernel
];

$estimator = new GridSearch(KNearestNeighbors::class, $grid, new FBeta(), new KFold(5));

$estimator->train($dataset);

var_dump($estimator->best());
array(3) {
  [0]=> int(3)
  [1]=> bool(true)
  [2]=> object(Rubix\ML\Kernels\Distance\Manhattan) {}
}
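
Because the best model is selected as the base estimator, subsequent predictions are made by that model. The short sketch below exercises the additional methods listed above; the unlabeled $testing dataset is assumed and not shown here.

// 4 values of k x 2 weighting options x 2 kernels = 16 candidate models were trained and scored.
var_dump(count($estimator->combinations()));

// Predictions are delegated to the best model found during the search.
$predictions = $estimator->predict($testing);

// The underlying base estimator can also be accessed directly.
$base = $estimator->estimator();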