Machine Learning, Volume 108, Issue 11, pp 2009–2034

A distributed feature selection scheme with partial information sharing

  • Aida Brankovic
  • Luigi Piroddi


This paper introduces a novel feature selection and classification method based on vertical data partitioning and a distributed search architecture. The features are divided into subsets, each of which is assigned to a dedicated processor that performs a local search. When all local selection processes are completed, each processor shares the features of its locally selected model with all other processors, and the local searches are repeated until convergence. Thanks to the vertical partitioning and the distributed selection scheme, the method can address relatively large-scale problems. The procedure is efficient because the local processors perform the selection tasks in parallel and on much smaller search spaces. Another important property of the proposed method is its tendency to produce simple model structures, which generally benefits the interpretability and robustness of the classifier. The approach is evaluated on several benchmark datasets and compared to other well-known feature selection and classification approaches from the literature. The results demonstrate its effectiveness in terms of both classification accuracy and computational time.
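The partition, search, share, and repeat loop described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the local search is a plain greedy forward selection, the model is a nearest-centroid classifier scored on training accuracy with a small size penalty, the "processors" are simulated sequentially, convergence is declared when the shared feature set stops changing, and the final model is taken to be the best local model. The names `centroid_accuracy`, `local_search`, `distributed_selection`, and `make_data` are hypothetical.

```python
import random


def centroid_accuracy(X, y, feats):
    """Training accuracy of a nearest-centroid classifier using only `feats`."""
    if not feats:
        return 0.0
    feats = sorted(feats)
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )
        correct += pred == t
    return correct / len(y)


def local_search(X, y, candidates, penalty=0.01):
    """Greedy forward selection over `candidates` (an assumed stand-in search)."""
    selected, best = set(), 0.0
    improved = True
    while improved:
        improved = False
        best_f = None
        for f in sorted(candidates - selected):
            trial = selected | {f}
            # Penalise model size so that small structures are preferred.
            score = centroid_accuracy(X, y, trial) - penalty * len(trial)
            if score > best + 1e-12:
                best, best_f, improved = score, f, True
        if improved:
            selected.add(best_f)
    return selected


def distributed_selection(X, y, n_procs=3, max_rounds=10):
    """Vertical partitioning with sharing of locally selected features."""
    n_feats = len(X[0])
    # Vertical partitioning: processor p owns features p, p + n_procs, ...
    blocks = [set(range(p, n_feats, n_procs)) for p in range(n_procs)]
    shared = set()
    local_models = []
    for _ in range(max_rounds):
        # Each local search sees its own block plus the features shared so far.
        # The searches are independent, so they could run in parallel;
        # here they are simulated one after another.
        local_models = [local_search(X, y, block | shared) for block in blocks]
        new_shared = set().union(*local_models)
        if new_shared == shared:  # assumed convergence test
            break
        shared = new_shared
    # Assumed final step: keep the most accurate (then smallest) local model.
    return max(local_models, key=lambda m: (centroid_accuracy(X, y, m), -len(m)))


def make_data(n=200, n_feats=9, informative=(0, 7), seed=0):
    """Synthetic two-class data in which only `informative` features matter."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_feats)]
        for f in informative:
            row[f] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_data()
selected = distributed_selection(X, y)
print(sorted(selected), centroid_accuracy(X, y, selected))
```

In this sketch each processor searches over at most its own block plus the currently shared features, which is much smaller than the full feature set; that is the source of the efficiency gain the abstract refers to. A real deployment would replace the sequential loop with parallel workers and the toy scoring function with a proper validation criterion.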


Feature selection, Classification, Model selection, Distributed optimization, Parallel processing




Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
