
A Novel LtR and RtL Framework for Subset Feature Selection (Reduction) for Improving the Classification Accuracy

  • Conference paper
Progress in Advanced Computing and Intelligent Engineering

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 713)

Abstract

Preprocessing is the data mining step that follows data collection. Several issues need to be addressed at this stage, and one of them is feature selection (FS), also called feature reduction (FR). Existing approaches to FS and FR are categorized as filter, wrapper, and embedded methods. In this research, we introduce a novel filter-based feature selection framework called LtR (left to right) and RtL (right to left), based on symmetrical uncertainty (SU). Our method generates K subsets of features such that each subset contains a finite number of unique features. Each subset is analyzed using various classifiers (JRip, OneR, Ridor, J48, SimpleCart, Naive Bayes, IBk) and compared with existing filter-based FS methods: information gain (IG), ReliefF (Rel), chi-squared attribute evaluator (Chi), and gain ratio attribute evaluator (GR). Experimental analysis revealed that at least one of the subsets performs better than some of the existing methods.
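For illustration only: symmetrical uncertainty between a discrete feature X and the class Y is commonly defined as SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below computes SU and ranks features by it, assuming already-discretized features. The function names and the toy data are hypothetical and are not taken from the paper. The LtR and RtL construction of the K subsets is not described in this abstract, so it is not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); the result lies in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant; treat SU as 0 by convention (assumption).
        return 0.0
    h_xy = entropy(list(zip(x, y)))      # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy       # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_by_su(features, labels):
    """Return feature names sorted by descending SU with the class labels."""
    scores = {name: symmetrical_uncertainty(col, labels)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: four instances, three discrete features.
labels = [0, 0, 1, 1]
features = {
    "noise": [0, 1, 0, 1],   # independent of the class
    "copy": [0, 0, 1, 1],    # identical to the class
    "const": [0, 0, 0, 0],   # carries no information
}
ranking = rank_by_su(features, labels)
```

A ranking like this is only the starting point of a filter method: the ranked list would then be split into subsets and each subset evaluated with the classifiers named in the abstract.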



Author information

Corresponding author

Correspondence to Sai Prasad Potharaju.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Potharaju, S.P., Sreedevi, M. (2019). A Novel LtR and RtL Framework for Subset Feature Selection (Reduction) for Improving the Classification Accuracy. In: Pati, B., Panigrahi, C., Misra, S., Pujari, A., Bakshi, S. (eds) Progress in Advanced Computing and Intelligent Engineering. Advances in Intelligent Systems and Computing, vol 713. Springer, Singapore. https://doi.org/10.1007/978-981-13-1708-8_20

  • DOI: https://doi.org/10.1007/978-981-13-1708-8_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1707-1

  • Online ISBN: 978-981-13-1708-8

  • eBook Packages: Engineering, Engineering (R0)
