Instance selection and feature extraction using cuttlefish optimization algorithm and principal component analysis using decision tree
Instance selection and feature extraction are among the most important tasks in data mining, because huge amounts of data are constantly being produced in many fields. When a dataset is very large, most existing machine learning algorithms cannot handle it, and the computational cost is high. Two approaches have been used to address this problem: scaling up the algorithms and reducing the data. Scaling up a data mining algorithm is not always feasible, but data reduction is possible. In this paper we use both instance selection and feature extraction for data reduction. Instance selection is a technique that reduces the size of the original training data. Feature extraction maps input data from an m-dimensional space into a lower-dimensional space, i.e., it eliminates the components that contribute little information. In this paper, the cuttlefish optimization algorithm is used for instance selection, while principal component analysis is used for feature extraction. Combining feature extraction and instance selection reduces the large computational time required to train the classifiers. The optimal extracted subset of data points and the reduced feature space provide nearly the same detection rate, accuracy, and false positive rate as the original dataset, while taking less computational time to train the classifiers.
Keywords: Cuttlefish optimization algorithm; Principal component analysis; Feature extraction and instance selection
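The two reduction steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The cuttlefish search is not implemented here: a random subset of rows stands in for the instances it would select. The function names (`pca_fit`, `pca_transform`), the 95% variance threshold, the synthetic data, and the order of the two steps (instances first, then PCA) are all assumptions made for illustration.

```python
import numpy as np


def pca_fit(X, variance_to_keep=0.95):
    """Return (mean, components), keeping just enough principal components
    to explain `variance_to_keep` of the total variance (assumed threshold)."""
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    # Smallest k whose cumulative explained variance reaches the threshold.
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, Vt.shape[0])
    return mean, Vt[:k]


def pca_transform(X, mean, components):
    """Project data onto the retained principal components."""
    return (X - mean) @ components.T


# Synthetic data (hypothetical): 5 features that really live in 2 dimensions.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 2))
W = np.array([[1.0, 0.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 1.0, 0.0]])
X = latent @ W + 0.01 * rng.normal(size=(n, 5))
y = (latent[:, 0] > 0).astype(int)

# Instance selection placeholder: keep a random half of the rows.
# An optimizer such as the cuttlefish algorithm would search for this mask instead.
keep = np.zeros(n, dtype=bool)
keep[rng.choice(n, size=n // 2, replace=False)] = True
X_sel, y_sel = X[keep], y[keep]

# Feature extraction: fit PCA on the selected instances and project them.
mean, components = pca_fit(X_sel)
X_red = pca_transform(X_sel, mean, components)

print(X.shape, "->", X_red.shape)
```

A classifier such as a decision tree would then be trained on `X_red` and `y_sel` rather than on the full `X` and `y`. Any test data would need to be projected with the same `mean` and `components`.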