
Unsupervised Feature Selection Using Correlation Score

  • Conference paper

Computing, Communication and Signal Processing

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 810)

Abstract

The widespread adoption of modern technologies generates data of very high dimensionality. Using such data for decision-making is hampered by the curse of dimensionality: keeping every feature invites overfitting, while discarding relevant features loses information. Feature selection algorithms address this problem by identifying a subset of the original features that retains the relevant features and removes the redundant ones. This paper evaluates and analyzes several popular feature selection algorithms on benchmark datasets. Relief, ReliefF, and Random Forest are assessed as rankers in combination with different classifiers, and it is observed empirically that the accuracy of each ranker-classifier combination varies from dataset to dataset. The paper then introduces the application of multivariate correlation analysis (MCA) to feature selection. The results indicate that MCA outperforms the legacy feature selection algorithms.
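The page does not reproduce the method's details, but the general idea of scoring features by correlation can be illustrated with a short, self-contained sketch. The snippet below is only an assumption-laden illustration, not the authors' MCA procedure: it treats a feature's mean absolute Pearson correlation with the remaining features as a redundancy score and keeps the least redundant features; the helper names `correlation_scores` and `select_features` are hypothetical.

```python
# Illustrative sketch only -- NOT the paper's MCA algorithm. Each feature is
# scored by its mean absolute Pearson correlation with the other features,
# and the least redundant features are kept. The criterion is unsupervised:
# no class labels are used.
import numpy as np

def correlation_scores(X):
    """Mean absolute correlation of each feature (column) with the others."""
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature matrix
    np.fill_diagonal(corr, 0.0)                  # ignore self-correlation
    return corr.mean(axis=1)

def select_features(X, k):
    """Indices of the k features with the lowest redundancy scores."""
    return np.argsort(correlation_scores(X))[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=200)  # make feature 3 redundant
    print(select_features(X, 3))  # feature 3 (or 0) should be ranked out as redundant
```

In this toy usage, feature 3 is an almost exact copy of feature 0, so one of the pair receives a high redundancy score and is excluded from the selected subset.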

Author information

Corresponding author

Correspondence to Tanuja Pattanshetti.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pattanshetti, T., Attar, V. (2019). Unsupervised Feature Selection Using Correlation Score. In: Iyer, B., Nalbalwar, S., Pathak, N. (eds) Computing, Communication and Signal Processing. Advances in Intelligent Systems and Computing, vol 810. Springer, Singapore. https://doi.org/10.1007/978-981-13-1513-8_37

  • DOI: https://doi.org/10.1007/978-981-13-1513-8_37

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1512-1

  • Online ISBN: 978-981-13-1513-8

  • eBook Packages: Engineering, Engineering (R0)
