An unsupervised imputer that replaces missing values in datasets with the weighted average according to the sample's k nearest neighbors.
Note: NaN safe distance kernels, such as Safe Euclidean, are required for continuous features.
Data Type Compatibility: Depends on distance kernel
|1||k||5||int||The number of nearest neighbors to consider when imputing a value.|
|2||weighted||true||bool||Should we use the inverse distances as confidence scores when imputing values?|
|3||kernel||Safe Euclidean||object||The distance kernel used to compute the distance between sample points.|
|4||placeholder||'?'||string||The categorical placeholder variable denoting the category that contains missing values.|
This transformer does not have any additional methods.
use Rubix\ML\Transformers\KNNImputer; use Rubix\ML\Kernels\Distance\Gower; $transformer = new KNNImputer(10, false, new Gower(), '?');
- O. Troyanskaya et al. (2001). Missing value estimation methods for DNA microarrays.