Finding the Optimal Number of Features Based on Mutual Information

  • Conference paper
Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 641)

Abstract

For high-dimensional data analytics, feature selection is an indispensable preprocessing step that reduces dimensionality while preserving the simplicity and interpretability of models. This is particularly important for fuzzy modeling, since fuzzy models are widely recognized for their transparency and interpretability. Despite the substantial work on feature selection, there is little research on determining the optimal number of features for a task. In this paper, we propose a method, based on mutual information, that helps find the optimal number of features effectively.
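The preview does not include the paper's method itself. As a hypothetical illustration of the core ingredient the abstract names, mutual information between a discrete feature and the class label can be estimated from co-occurrence counts and used to rank features; the function name, toy data, and feature labels below are invented for this sketch and are not from the paper:

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p(x,y) / (p(x) p(y)) expressed with raw counts: c*n / (px*py)
        mi += p_joint * log2(c * n / (px[x] * py[y]))
    return mi

# Toy data: f1 is identical to the label (maximally informative),
# f2 is only weakly related to it.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
f1 =     [0, 0, 1, 1, 0, 1, 0, 1]
f2 =     [0, 1, 0, 1, 1, 0, 1, 0]

scores = {"f1": mutual_information(f1, labels),
          "f2": mutual_information(f2, labels)}
ranked = sorted(scores, key=scores.get, reverse=True)  # ["f1", "f2"]
```

With balanced binary labels, a feature identical to the label scores I(X;Y) = H(Y) = 1 bit, the upper bound; ranking features by such scores is the usual starting point for mutual-information-based selection, on top of which a stopping criterion for the number of features can be defined.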



Acknowledgement

This work is partially supported by Philips Research within the scope of the BrainBridge Program.

Author information

Correspondence to Peipei Chen.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Chen, P., Wilbik, A., van Loon, S., Boer, A.K., Kaymak, U. (2018). Finding the Optimal Number of Features Based on Mutual Information. In: Kacprzyk, J., Szmidt, E., Zadrożny, S., Atanassov, K., Krawczak, M. (eds) Advances in Fuzzy Logic and Technology 2017 (EUSFLAT 2017, IWIFSGN 2017). Advances in Intelligent Systems and Computing, vol 641. Springer, Cham. https://doi.org/10.1007/978-3-319-66830-7_43

  • DOI: https://doi.org/10.1007/978-3-319-66830-7_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66829-1

  • Online ISBN: 978-3-319-66830-7

  • eBook Packages: Engineering (R0)
