
Feature Selection Methods Based on Decision Rule and Tree Models

  • Conference paper
  • In: Intelligent Decision Technologies 2016

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 57)

Abstract

Feature selection, as a preprocessing step to machine learning, is effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, novel concepts of relevant feature selection based on information gathered from decision rule and decision tree models are introduced. Two new measures, DRQualityImp and DTLevelImp, are additionally defined. The first is based on feature presence frequency and rule quality, while the second is based on feature presence at different levels of a decision tree. The efficiency and effectiveness of the method are demonstrated on five real-world datasets. Promising initial classification results were obtained, together with a substantial reduction of problem dimensionality.
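The abstract only sketches the two measures. The following minimal Python sketch illustrates one plausible reading: DRQualityImp as rule quality summed over rules whose premises contain a feature, and DTLevelImp as occurrence counts weighted so that features nearer the tree root score higher. The function names and the exact weighting are assumptions for illustration, not the paper's definitions.

```python
def dr_quality_imp(rules, feature):
    """DRQualityImp (assumed form): sum of rule quality over all
    decision rules whose premise mentions the feature."""
    return sum(quality for premise, quality in rules if feature in premise)

def dt_level_imp(tree_levels, feature):
    """DTLevelImp (assumed form): feature occurrences weighted by tree
    level, so splits near the root contribute more."""
    max_level = len(tree_levels)
    return sum((max_level - level) * features.count(feature)
               for level, features in enumerate(tree_levels))

# Toy rule set: (premise feature set, rule quality) pairs.
rules = [({"f1", "f2"}, 0.9), ({"f1"}, 0.6), ({"f3"}, 0.8)]
print(dr_quality_imp(rules, "f1"))  # 0.9 + 0.6 = 1.5

# Tree levels listed root-first: the root splits on f1,
# the second level splits on f1 and f3.
tree_levels = [["f1"], ["f1", "f3"]]
print(dt_level_imp(tree_levels, "f1"))  # 2*1 + 1*1 = 3
```

Under this reading, both measures can be computed in a single pass over the rule set or the tree, which is consistent with the abstract's claim of efficiency on high-dimensional data.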



Acknowledgments

This work was supported by the Center for Innovation and Transfer of Natural Sciences and Engineering Knowledge at the University of Rzeszów.

Author information

Correspondence to Wiesław Paja.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Paja, W. (2016). Feature Selection Methods Based on Decision Rule and Tree Models. In: Czarnowski, I., Caballero, A.M., Howlett, R.J., Jain, L.C. (eds) Intelligent Decision Technologies 2016. Smart Innovation, Systems and Technologies, vol 57. Springer, Cham. https://doi.org/10.1007/978-3-319-39627-9_6


  • DOI: https://doi.org/10.1007/978-3-319-39627-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39626-2

  • Online ISBN: 978-3-319-39627-9

  • eBook Packages: Engineering (R0)
