Dimension Reduction and Feature Selection

Chapter in: Data Mining and Knowledge Discovery Handbook

Summary

Data mining algorithms search for meaningful patterns in raw data sets, and the mining process incurs a high computational cost on large data sets. Reducing dimensionality (the number of attributes or the number of records) can effectively cut this cost. This chapter focuses on a pre-processing step that reduces the dimensionality of a data set before it is fed to a data mining algorithm, and explains how dimensionality can often be reduced with minimal loss of information. A clear taxonomy of dimension reduction is described, and the reduction techniques are presented theoretically.
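As a concrete illustration of reducing the number of attributes with little information loss, the following is a minimal sketch (not taken from the chapter) of principal component analysis used as a pre-processing step. It assumes NumPy is available; the function name pca_reduce and the synthetic data are illustrative only.

import numpy as np

def pca_reduce(X, k):
    """Project the n x d data matrix X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                         # center each attribute
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    retained = (S[:k] ** 2).sum() / (S ** 2).sum()  # fraction of variance kept
    return Xc @ Vt[:k].T, retained

rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 5))                       # 5 latent factors
W = rng.normal(size=(5, 50))
X = Z @ W + 0.1 * rng.normal(size=(500, 50))        # 500 records, 50 mostly redundant attributes

X_reduced, retained = pca_reduce(X, k=5)
print(X_reduced.shape)                              # (500, 5): far fewer attributes
print(f"variance retained: {retained:.2%}")         # close to 100%

Because the fifty observed attributes here are generated from five latent factors, five components retain nearly all of the variance; this is the "minimum loss of information" the summary refers to.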

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Chizi, B., Maimon, O. (2009). Dimension Reduction and Feature Selection. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09823-4_5

  • DOI: https://doi.org/10.1007/978-0-387-09823-4_5

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-09822-7

  • Online ISBN: 978-0-387-09823-4
