Information Transmission and Nonspecificity in Feature Selection

  • Pasi Luukka
  • Christoph Lohrmann
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1000)

Abstract

In this paper, we propose a novel feature selection method based on fuzzy measures. More specifically, we use a similarity measure to form similarity matrices from the data and apply nonspecificity to the similarity degrees in order to conduct feature selection. To measure how relevant a particular feature is, we apply an information transmission measure. We illustrate the method on a simple artificial example to demonstrate its ability to select informative features. Moreover, we test it on two real-world data sets, the chronic kidney disease data set and the diabetic retinopathy Debrecen data set. For both data sets, the nonspecificity-based feature selection method improves the mean classification performance. Compared with the popular ReliefF algorithm and the Fisher score, the new method achieves competitive results and attains the highest mean accuracy on both data sets.
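
As a rough illustration of the idea of applying nonspecificity to similarity degrees, the following minimal Python sketch may help. It is not the authors' implementation: the abstract does not specify the similarity measure, the class representatives, or how the information transmission measure enters, so the sketch assumes (hypothetically) min-max scaled features, class means as "ideal" vectors, a Łukasiewicz-type similarity 1 - |x - v|, and the standard U-uncertainty (generalised Hartley measure) as the nonspecificity measure. The function names `u_uncertainty` and `feature_nonspecificity` are invented for this example.

```python
import numpy as np


def u_uncertainty(memberships):
    """Nonspecificity (U-uncertainty) of a fuzzy set given by its membership degrees.

    Computes (1/h) * integral over alpha in (0, h] of log2(|alpha-cut|),
    where h is the largest membership degree.
    """
    mu = np.sort(np.asarray(memberships, dtype=float))[::-1]  # descending order
    h = mu[0]
    if h <= 0:
        return 0.0
    # For alpha between mu[i+1] and mu[i], the alpha-cut contains i + 1 elements.
    levels = np.append(mu, 0.0)
    widths = levels[:-1] - levels[1:]
    sizes = np.arange(1, len(mu) + 1)
    return float(np.sum(widths * np.log2(sizes)) / h)


def feature_nonspecificity(X, y):
    """Per-feature nonspecificity of similarity degrees, summed over classes.

    Assumptions (not taken from the paper): min-max scaling, class means as
    ideal vectors, similarity = 1 - |x - ideal|.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    Xs = (X - lo) / span
    scores = np.zeros(X.shape[1])
    for c in np.unique(y):
        ideal = Xs[y == c].mean(axis=0)
        sim = 1.0 - np.abs(Xs - ideal)  # one similarity column per feature
        for j in range(X.shape[1]):
            scores[j] += u_uncertainty(sim[:, j])
    return scores


if __name__ == "__main__":
    X = [[0.0, 1.0], [0.1, 0.0], [0.9, 1.0], [1.0, 0.0]]
    y = [0, 0, 1, 1]
    print(feature_nonspecificity(X, y))
```

The sketch stops at per-feature scores. How such scores are combined with the information transmission measure and turned into a selection decision is described in the paper itself, not in the abstract, and is deliberately left out here.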

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Business and Management, Lappeenranta University of Technology, Lappeenranta, Finland
