Random Hot Deck Imputer
A method of imputation similar to KNN Imputer, but instead of computing a weighted average of the neighbors' feature values, Random Hot Deck draws a single donor value at random from the neighborhood, with the draw optionally weighted by distance. This makes Random Hot Deck Imputer slightly more computationally efficient than KNN Imputer while also satisfying some balancing equations.
Note: NaN-safe distance kernels, such as Safe Euclidean, are required for continuous features.
Data Type Compatibility: Depends on distance kernel
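The sampling step described above can be sketched in plain PHP. This is a minimal illustration, not the library's implementation: the helper name `sampleDonor`, the inverse-distance weighting formula, and the small constant `1e-8` used to avoid division by zero are all assumptions made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): pick one donor value from
 * a neighborhood, optionally weighting the draw by inverse distance.
 *
 * @param array $values    candidate donor values, one per neighbor
 * @param array $distances distance from the query sample to each neighbor
 */
function sampleDonor(array $values, array $distances, bool $weighted = true)
{
    if (count($values) === 0 || count($values) !== count($distances)) {
        throw new InvalidArgumentException('Values and distances must be non-empty and aligned.');
    }

    // Unweighted case: every neighbor is equally likely to be the donor.
    if (!$weighted) {
        return $values[array_rand($values)];
    }

    // Inverse-distance weights; the small constant (an assumption) avoids division by zero.
    $weights = array_map(fn (float $d): float => 1.0 / (1e-8 + $d), $distances);

    // Draw a point in [0, total) and walk the cumulative weights until it is passed.
    $threshold = array_sum($weights) * (mt_rand() / (mt_getrandmax() + 1));

    foreach ($weights as $i => $weight) {
        $threshold -= $weight;

        if ($threshold < 0.0) {
            return $values[$i];
        }
    }

    // Guard against floating point rounding error.
    return $values[array_key_last($values)];
}

// Three neighbors: the closest one (distance 0.5) is the most likely donor.
$donor = sampleDonor([3.2, 4.1, 9.9], [0.5, 1.0, 4.0]);
```

Because only one value is drawn rather than averaged, the imputed value is always one that was actually observed in the neighborhood, which is the defining property of hot deck methods.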
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | k | 5 | int | The number of nearest neighbors to consider when imputing a value. |
| 2 | weighted | true | bool | Should we use the inverse distances as confidence scores when imputing values? |
| 3 | placeholder | '?' | string | The categorical placeholder denoting the category that contains missing values. |
| 4 | tree | BallTree | Spatial | The spatial tree used to run nearest neighbor searches. |
This transformer does not have any additional methods.
use Rubix\ML\Transformers\RandomHotDeckImputer;
use Rubix\ML\Graph\Trees\BallTree;
use Rubix\ML\Kernels\Distance\SafeEuclidean;

$transformer = new RandomHotDeckImputer(20, true, '?', new BallTree(50, new SafeEuclidean()));
- C. Hasler et al. (2015). Balanced k-Nearest Neighbor Imputation.