A Self-generating Prototype Method Based on Information Entropy Used for Condensing Data in Classification Tasks
This paper presents a new self-generating prototype method, based on information entropy, for reducing the size of training datasets. The method shortens classifier training time without significantly degrading classification quality. The proposed method is compared with the k-nearest neighbour classifier (kNN) and with genetic algorithm prototype selection (GA). kNN is a benchmark method for data classification tasks, while GA is a prototype selection method that offers a competitive trade-off between accuracy and data reduction ratio. Across thirty public datasets, the comparisons show that the proposed method outperforms kNN both when kNN uses the original training set and when it uses the reduced training set obtained via GA prototype selection.
Keywords: Prototype Selection (PS), Data reduction, Data classification, Genetic Algorithm (GA)
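The abstract names three quantities without defining them: the information entropy of a set of samples, the data reduction ratio, and kNN classification over a set of prototypes. The sketch below illustrates each in plain Python. It is not the paper's prototype-generation procedure, which the abstract does not describe. The function names (`shannon_entropy`, `reduction_ratio`, `knn_predict`) are hypothetical, and the choices of Shannon entropy over class labels, Euclidean distance, and a reduction ratio of one minus the retained fraction are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the class-label distribution.

    Assumption: entropy is measured over class labels; the paper may
    define its entropy criterion differently.
    """
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def reduction_ratio(n_original, n_reduced):
    """Fraction of the training set removed (assumed definition)."""
    return 1.0 - n_reduced / n_original


def knn_predict(prototypes, query, k=1):
    """Majority vote among the k prototypes nearest to `query`.

    `prototypes` is a list of (feature_tuple, label) pairs; distance is
    Euclidean (an assumption, not stated in the abstract).
    """
    nearest = sorted(prototypes, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In a sketch like this, a group of samples whose labels have entropy close to zero is nearly pure, so replacing it with a single prototype would lose little class information. A group with high entropy mixes classes and would likely need further splitting before it is condensed.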