Random Hot Deck Imputer
An imputation method similar to KNN Imputer. Instead of computing a weighted average of the neighbors' feature values, Random Hot Deck picks a value from the neighborhood at random. This makes Random Hot Deck Imputer slightly cheaper to compute while still satisfying some balancing equations.
Note: NaN safe distance kernels, such as Safe Euclidean, are required for continuous features.
Data Type Compatibility: Depends on distance kernel
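The idea described above can be sketched in plain PHP without the library. This is an illustration only, not the library's implementation: the function names `safeEuclidean` and `hotDeckImpute` are hypothetical, the rescaling used for skipped dimensions and the `1 / (1 + distance)` weighting are assumptions, and only continuous features with `NAN` as the missing-value marker are handled.

```php
<?php

declare(strict_types=1);

/**
 * NaN-safe Euclidean distance (hypothetical helper): dimensions where either
 * value is NaN are skipped and the sum is rescaled by the share of usable
 * dimensions. The rescaling is an assumption made for this sketch.
 */
function safeEuclidean(array $a, array $b): float
{
    $sum = 0.0;
    $used = 0;

    foreach ($a as $i => $value) {
        if (is_nan((float) $value) || is_nan((float) $b[$i])) {
            continue;
        }

        $sum += ($value - $b[$i]) ** 2;
        $used++;
    }

    if ($used === 0) {
        return INF; // No shared dimensions to compare.
    }

    return sqrt(count($a) / $used * $sum);
}

/**
 * Fill each NaN in $sample with the value held by a randomly drawn donor
 * among the k nearest donors (hypothetical helper).
 */
function hotDeckImpute(array $sample, array $donors, int $k = 5, bool $weighted = true): array
{
    if (empty($donors)) {
        throw new InvalidArgumentException('At least one donor is required.');
    }

    $distances = [];

    foreach ($donors as $index => $donor) {
        $distances[$index] = safeEuclidean($sample, $donor);
    }

    asort($distances); // Sort ascending while keeping the donor indices.

    $neighbors = array_slice($distances, 0, $k, true);

    // Closer donors get a larger weight when $weighted is true.
    $weights = [];

    foreach ($neighbors as $index => $distance) {
        $weights[$index] = $weighted ? 1.0 / (1.0 + $distance) : 1.0;
    }

    $total = array_sum($weights);

    foreach ($sample as $column => $value) {
        if (!is_nan((float) $value)) {
            continue;
        }

        // Roulette-wheel draw over the neighborhood.
        $threshold = $total * mt_rand() / mt_getrandmax();
        $chosen = array_key_last($weights);

        foreach ($weights as $index => $weight) {
            $threshold -= $weight;

            if ($threshold <= 0.0) {
                $chosen = $index;

                break;
            }
        }

        $sample[$column] = $donors[$chosen][$column];
    }

    return $sample;
}

$donors = [[1.0, 10.0], [1.2, 11.0], [9.0, 50.0]];

// With k = 2 the missing value is copied from one of the two closest donors.
$filled = hotDeckImpute([1.1, NAN], $donors, 2);
```

Because the imputed value is copied from an actual donor rather than averaged, it is always a value that occurs in the data, which is the main practical difference from the weighted-average approach.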
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | k | 5 | int | The number of nearest neighbors to consider when imputing a value. |
| 2 | weighted | true | bool | Should we use the inverse distances as confidence scores when imputing values? |
| 3 | kernel | Safe Euclidean | Distance | The distance kernel used to compute the distance between sample points. |
| 4 | placeholder | '?' | string | The categorical placeholder variable denoting the category that contains missing values. |
This transformer does not have any additional methods.
use Rubix\ML\Transformers\RandomHotDeckImputer;
use Rubix\ML\Kernels\Distance\SafeEuclidean;

$transformer = new RandomHotDeckImputer(20, true, new SafeEuclidean(), '?');
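A transformer is usually applied to a dataset rather than used on its own. The following is a minimal sketch, assuming the constructor signature listed in the parameter table, that missing continuous values are encoded as `NAN`, and that Composer's autoloader sits at the conventional `vendor/autoload.php` path. The sample values are made up for illustration.

```php
<?php

require 'vendor/autoload.php'; // Assumed Composer autoload path.

use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Transformers\RandomHotDeckImputer;
use Rubix\ML\Kernels\Distance\SafeEuclidean;

// Illustrative data: the last sample is missing its second feature.
$dataset = new Unlabeled([
    [1.0, 10.0],
    [1.2, 11.0],
    [1.1, 10.5],
    [9.0, 50.0],
    [1.1, NAN],
]);

// Applying the transformer modifies the dataset in place.
$dataset->apply(new RandomHotDeckImputer(2, true, new SafeEuclidean(), '?'));

print_r($dataset->samples());
```

Since the donor is drawn at random, the imputed value can differ between runs.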
- C. Hasler et al. (2015). Balanced k-Nearest Neighbor Imputation.