An Improved BPSO Algorithm for Feature Selection
In machine learning and data mining tasks, feature selection is used to identify a relevant subset of features. High-dimensional datasets often contain many redundant and irrelevant features, which degrade clustering performance. Feature selection is therefore necessary to improve clustering performance. In this paper, we select the optimal subset of features and perform cluster analysis simultaneously using modified-BPSO (Binary Particle Swarm Optimization) and K-means. The quality of the resulting clusters is measured by several cluster validation indices. We compare the overall performance of modified-BPSO with BPSO and BMFOA (Binary Moth Flame Optimization Algorithm) on six real datasets drawn from the UC Irvine Machine Learning Repository; the results show that the proposed method performs better than the other methods compared.
Keywords: BPSO, BMFO, Cluster validation index, Data clustering, Feature selection, Swarm intelligence
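To make the approach in the abstract concrete, the following is a minimal sketch of simultaneous feature selection and clustering with BPSO and K-means. It rests on several assumptions that are not taken from the paper: it uses the standard sigmoid-based BPSO update rather than the proposed modification (which the abstract does not describe), it uses the mean silhouette width as a stand-in for the unspecified cluster validation indices, and it runs on synthetic data rather than the UCI datasets. All function names and parameter values (`bpso_select`, `fitness`, `make_demo_data`, `w`, `c1`, `c2`) are hypothetical.

```python
import numpy as np


def kmeans(X, k, rng, iters=20):
    """Plain Lloyd's K-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)


def silhouette(X, labels):
    """Mean silhouette width in [-1, 1]; higher means better-separated clusters."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return -1.0
    D = np.linalg.norm(X[:, None] - X[None], axis=2)  # pairwise distances
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton clusters score 0 by convention
        a = D[i, same].sum() / (same.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return scores.mean()


def fitness(mask, X, k, seed):
    """Assumed fitness: silhouette of K-means run on the selected features only."""
    if not mask.any():
        return -1.0  # an empty feature subset is the worst possible candidate
    Xs = X[:, mask]
    return silhouette(Xs, kmeans(Xs, k, np.random.default_rng(seed)))


def bpso_select(X, k, n_particles=10, n_iter=15, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard BPSO over feature masks (not the paper's modified variant)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(int)  # 1 = feature selected
    vel = rng.uniform(-1, 1, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p.astype(bool), X, k, seed) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4, 4)  # keep the sigmoid away from saturation
        # Binary position update: each bit is 1 with probability sigmoid(velocity)
        pos = (rng.random((n_particles, d)) < 1 / (1 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p.astype(bool), X, k, seed) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), gbest_fit


def make_demo_data(n_per_cluster=30, n_noise=4, seed=0):
    """Synthetic data: 2 informative features (3 clusters) plus noise features."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
    informative = np.vstack(
        [c + rng.normal(scale=0.5, size=(n_per_cluster, 2)) for c in centers]
    )
    noise = rng.uniform(0, 5, size=(len(informative), n_noise))
    return np.hstack([informative, noise])


if __name__ == "__main__":
    X = make_demo_data()
    mask, score = bpso_select(X, k=3)
    print("selected features:", np.flatnonzero(mask), "silhouette:", round(score, 3))
```

Each particle is a bit mask over the features, so evaluating a particle means clustering the data restricted to the selected columns and scoring the result. Seeding K-means identically for every evaluation keeps the fitness of a given mask deterministic, which makes personal-best and global-best comparisons meaningful; a real experiment would likely average over several K-means restarts instead.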