Abstract
Feature selection is a critical component of data science and has been studied for many years. Advances in hardware and the availability of better multiprocessing platforms have enabled parallel computing to reach very high levels of performance. Minimum Redundancy Maximum Relevance (mRMR) is a powerful feature selection technique used in many applications. In this paper, we present a novel optimized Single Program Multiple Data (SPMD) approach to implementing the mRMR algorithm, with synchronous computation, optimal load balancing, and greater speedup than task-parallel approaches. Experimental results on multiple synthesized datasets demonstrate the efficiency and scalability of the proposed technique relative to the original mRMR.
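The abstract does not give implementation details, so the following is only a minimal sketch of how a greedy mRMR selection could be organised in SPMD style; it is not the authors' implementation. It assumes discrete features, the common "relevance minus mean redundancy" criterion, and equal-sized contiguous blocks of features per rank. The names `mutual_information`, `rank_best`, and `mrmr_spmd` are hypothetical, and threads stand in for SPMD ranks purely for illustration (in CPython they give no real speedup; an actual implementation would more likely use MPI or OpenMP).

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def rank_best(rank, n_ranks, columns, target, selected):
    """Same program on every rank: score only this rank's block of features."""
    n_features = len(columns)
    block = math.ceil(n_features / n_ranks)  # equal-sized blocks per rank
    lo, hi = rank * block, min((rank + 1) * block, n_features)
    best_score, best_idx = -math.inf, None
    for j in range(lo, hi):
        if j in selected:
            continue
        relevance = mutual_information(columns[j], target)
        redundancy = 0.0
        if selected:
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
        score = relevance - redundancy
        if score > best_score:
            best_score, best_idx = score, j
    return best_score, best_idx


def mrmr_spmd(columns, target, k, n_ranks=2):
    """Greedy mRMR: each round, all ranks score their block, then reduce."""
    selected = []
    with ThreadPoolExecutor(max_workers=n_ranks) as pool:
        for _ in range(min(k, len(columns))):
            results = list(pool.map(
                lambda r: rank_best(r, n_ranks, columns, target, selected),
                range(n_ranks),
            ))
            # Synchronisation point: reduce per-rank winners to a global winner.
            candidates = [r for r in results if r[1] is not None]
            _, idx = max(candidates, key=lambda r: r[0])
            selected.append(idx)
    return selected


# Toy data: feature 1 duplicates feature 0; feature 2 is independent of both.
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = list(f0)
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
target = [2 * a + b for a, b in zip(f0, f2)]
print(mrmr_spmd([f0, f1, f2], target, k=2))
```

In the toy data, the duplicate feature carries the same relevance as the first one but is fully redundant with it, so the second pick goes to the complementary feature instead. The per-round reduction is the only point where ranks need to synchronise, which is what makes the equal-block split a natural fit for SPMD.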
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chanduka, B., Gangavarapu, T., Jaidhar, C.D. (2020). A Single Program Multiple Data Algorithm for Feature Selection. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2018. Advances in Intelligent Systems and Computing, vol 940. Springer, Cham. https://doi.org/10.1007/978-3-030-16657-1_62
DOI: https://doi.org/10.1007/978-3-030-16657-1_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16656-4
Online ISBN: 978-3-030-16657-1
eBook Packages: Intelligent Technologies and Robotics (R0)