Abstract
A novel approach to combining feature selection and clustering is presented. It selects features using weighted principal components and clusters the data automatically with an improved differential evolution (DE) algorithm, with the aim of reducing the complexity of high-dimensional datasets and speeding up the DE clustering process. We report significant improvements in total runtime. Moreover, the clustering accuracy of DE clustering on the dimensionality-reduced data is comparable to that obtained on the full-dimensional datasets. The efficiency of the approach is demonstrated on several real-life datasets.
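The two-stage pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the weighting scheme (absolute PCA loadings weighted by explained-variance ratio), the function names, and all parameter values are assumptions. The clustering step is also simplified to a basic DE/rand/1/bin search over centroids with a fixed number of clusters `k`, whereas the approach described here determines the number of clusters automatically.

```python
import numpy as np

def weighted_pc_scores(X, n_components):
    """Score each feature by its absolute loadings on the leading principal
    components, weighted by each component's explained-variance ratio.
    (Assumed weighting scheme, for illustration only.)"""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = s**2 / np.sum(s**2)
    k = n_components
    return np.abs(Vt[:k]).T @ var_ratio[:k]  # shape: (n_features,)

def select_features(X, n_components, n_keep):
    """Return the sorted indices of the n_keep highest-scoring features."""
    scores = weighted_pc_scores(X, n_components)
    return np.sort(np.argsort(scores)[::-1][:n_keep])

def sse(X, centroids):
    """Sum of squared distances to the nearest centroid, plus the labels."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.sum(d.min(axis=1) ** 2), d.argmin(axis=1)

def de_cluster(X, k, pop_size=20, generations=100, F=0.5, CR=0.9, seed=0):
    """Basic DE/rand/1/bin over centroid sets with a fixed k.
    (Simplification: no automatic selection of the cluster count.)"""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Each individual encodes k centroids inside the data's bounding box.
    pop = rng.uniform(lo, hi, size=(pop_size, k, n_features))
    fit = np.array([sse(X, p)[0] for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            # Binomial crossover; force at least one gene from the mutant.
            mask = rng.random((k, n_features)) < CR
            mask[rng.integers(k), rng.integers(n_features)] = True
            trial = np.where(mask, mutant, pop[i])
            f = sse(X, trial)[0]
            if f <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    best = pop[fit.argmin()]
    return best, sse(X, best)[1]
```

In use, `select_features` would be applied first and `de_cluster` would then run only on the retained columns, for example `de_cluster(X[:, keep], k=2)`. Each fitness evaluation then computes distances in fewer dimensions, which is where a runtime saving of the kind reported above would come from.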
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Naik, A., Satapathy, S.C. (2014). Efficient Clustering of Dataset Based on Differential Evolution. In: Satapathy, S., Udgata, S., Biswal, B. (eds) Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013. Advances in Intelligent Systems and Computing, vol 247. Springer, Cham. https://doi.org/10.1007/978-3-319-02931-3_26
DOI: https://doi.org/10.1007/978-3-319-02931-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02930-6
Online ISBN: 978-3-319-02931-3
eBook Packages: Engineering, Engineering (R0)