
EDDE–LNS: a new hybrid ensemblist approach for feature selection

  • Regular Research Paper
  • Published in: Memetic Computing

Abstract

Feature selection is the process of selecting a subset of relevant, non-redundant features from the original feature set. It is an NP-hard combinatorial optimization problem. In this paper, we propose a new feature selection method, abbreviated EDDE–LNS, that combines large neighbourhood search (LNS) with a new Ensemblist Discrete Differential Evolution (EDDE). Each solution in the search space represents a feature subset of predefined size K. EDDE–LNS explores this search space by evolving a population of individuals in two phases. In the first phase, the LNS strategy improves each feature subset by alternately destroying and repairing it; the proposed accuracy rate difference measure identifies the irrelevant and redundant features to be removed during the destruction step. In the second phase, the individuals produced by LNS serve as inputs to the proposed EDDE approach. EDDE is a discrete algorithm inspired by the differential evolution (DE) method. Whereas the original DE method searches for the best feature subset in a multidimensional space by applying simple, fast arithmetic operators to each dimension (feature) separately, the EDDE approach proposed in this paper searches a one-dimensional space by applying new ensemblist (set-based) operators to a set of K features. In this way, EDDE accounts for possible interactions between features. Experiments are conducted on intrusion detection and other machine learning datasets. The results indicate that the proposed approach achieves good accuracy in comparison with other well-known feature selection methods.
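The two phases described above can be illustrated with a minimal sketch. This is not the authors' implementation: the destroy/repair heuristics, the function names, and the trimming rule are assumptions standing in for the paper's accuracy rate difference measure and ensemblist operators. The sketch only shows the structural idea, namely that an LNS step perturbs a fixed-size subset by removal and re-insertion, and that a set-based DE mutation combines three parent subsets while preserving the subset size K.

```python
import random

def lns_destroy_repair(subset, all_features, k, score, destroy_frac=0.3):
    """Hypothetical LNS step: destroy by dropping the features that
    contribute least to the score, then repair by greedily re-adding
    candidates until the subset has size k again."""
    subset = list(subset)
    n_remove = max(1, int(destroy_frac * k))
    for _ in range(n_remove):
        # Contribution of f = score drop when f is removed; drop the weakest.
        worst = min(subset,
                    key=lambda f: score(subset) - score([g for g in subset if g != f]))
        subset.remove(worst)
    while len(subset) < k:
        candidates = [f for f in all_features if f not in subset]
        best = max(candidates, key=lambda f: score(subset + [f]))
        subset.append(best)
    return set(subset)

def edde_mutation(x1, x2, x3, k, all_features, rng=random):
    """Hypothetical set-based ('ensemblist') DE mutation: start from x1,
    inject the features present in x2 but absent from x3, then trim or
    pad the result back to exactly k features."""
    trial = set(x1) | (set(x2) - set(x3))
    if len(trial) > k:
        trial = set(rng.sample(sorted(trial), k))
    pool = [f for f in all_features if f not in trial]
    while len(trial) < k:
        trial.add(pool.pop(rng.randrange(len(pool))))
    return trial
```

In a full loop, each generation would apply `lns_destroy_repair` to every individual, then feed the improved subsets into `edde_mutation` and keep the trial subset when its (classifier-based) score improves on the parent's.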



Acknowledgements

The authors would like to thank the associate editor and anonymous reviewers for their valuable comments that have significantly helped to improve the paper quality. They would like also to thank Prof. Ahmed Guessoum for his professional proofreading that has greatly helped to improve the readability of the paper.

Author information


Correspondence to Wassila Guendouzi.


About this article


Cite this article

Guendouzi, W., Boukra, A. EDDE–LNS: a new hybrid ensemblist approach for feature selection. Memetic Comp. 10, 63–79 (2018). https://doi.org/10.1007/s12293-017-0226-5
