k-Nearest Neighbor Classification
The k-nearest neighbor (k-NN) method is one of the data mining techniques considered to be among the top 10 techniques for data mining . The k-NN method uses the well-known principle of Cicero pares cum paribus facillime congregantur (birds of a feather flock together or literally equals with equals easily associate). It tries to classify an unknown sample based on the known classification of its neighbors. Let us suppose that a set of samples with known classification is available, the so-called training set. Intuitively, each sample should be classified similarly to its surrounding samples. Therefore, if the classification of a sample is unknown, then it could be predicted by considering the classification of its nearest neighbor samples. Given an unknown sample and a training set, all the distances between the unknown sample and all the samples in the training set can be computed. The distance with the smallest value corresponds to the sample in the training set closest to the unknown sample. Therefore, the unknown sample may be classified based on the classification of this nearest neighbor.
KeywordsUnknown Sample Data Mining Technique Consistent Subset Regional Spectral Model Representative Prototype
Unable to display preview. Download preview PDF.