A Hybrid-Based Feature Selection Approach for IDS

  • Amrita
  • P. Ahmed
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 284)

Abstract

An intrusion detection (ID) technique uses a classification method to label incoming network traffic, represented as a feature vector, as either anomalous or normal. In practice, a high-dimensional feature vector has been observed to degrade classification performance. To reduce the dimensionality without compromising performance, a new hybrid feature selection method is introduced, and its performance is measured on the KDD Cup'99 dataset with the Naïve Bayes and C4.5 classifiers. Three sets of experiments were conducted on this dataset with these classifiers: one using the full feature set, one using the reduced feature sets obtained from four well-known feature selection methods, namely Correlation-based Feature Selection (CFS), Consistency-based Feature Selection (CON), Information Gain (IG) and Gain Ratio (GR), and one using the feature set obtained from the proposed method. In the first experiment, Naïve Bayes and C4.5 yielded classification accuracies of 97.5 % and 99.8 %, respectively. In the second set of experiments, the best accuracies of these classifiers, 99.1 % and 99.8 %, were achieved with IG. In the third experiment, the proposed method selected six features, with which the classifiers achieved accuracies of 99.4 % and 99.9 %, respectively. The proposed hybrid feature selection method outperformed the four methods mentioned above on various metrics.
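As an illustration of the filter-style ranking that the IG and GR baselines rely on, the following is a minimal Python sketch of the textbook definitions of Information Gain and Gain Ratio for discrete features. It is not the proposed hybrid method, which the abstract does not describe. The function names and the four-record toy dataset are assumptions made for illustration; only the feature names `protocol_type` and `flag` are taken from the KDD Cup'99 schema, and the values and labels are invented.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature_values, labels):
    """IG = H(class) - H(class | feature) for one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature_values, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature_values, labels) / split_info


def rank_features(rows, labels, names, score=information_gain):
    """Score each column of `rows` and return (name, score) pairs, best first."""
    scores = {
        name: score([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (invented): two discrete features and a binary class label.
names = ["protocol_type", "flag"]
rows = [("tcp", "SF"), ("tcp", "S0"), ("udp", "SF"), ("udp", "S0")]
labels = ["normal", "anomaly", "normal", "anomaly"]

if __name__ == "__main__":
    for name, value in rank_features(rows, labels, names):
        print(f"{name}: {value:.3f}")
```

A filter method of this kind would keep the top-ranked features and pass only those to the classifier. Continuous features would have to be discretised before these functions apply.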

Keywords

Feature Selection, Information Gain, True Positive Rate, Feature Subset, Feature Selection Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of CSE, SET, Sharda University, Greater Noida, India
