Application of a Genetic Algorithm to Variable Selection in Fuzzy Clustering

  • Conference paper
Classification — the Ubiquitous Challenge

Abstract

In order to group the observations of a data set into a given number of clusters, an ‘optimal’ subset out of a greater number of explanatory variables is to be selected. The problem is approached by maximizing a quality measure under certain restrictions that are supposed to keep the subset most representative of the whole data. The restrictions may either be set manually, or generated from the data. A genetic optimization algorithm is developed to solve this problem.

The procedure is then applied to a data set describing features of sub-districts of the city of Dortmund, Germany, to detect different social milieus and investigate the variables making up the differences between these.
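
The selection scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the quality measure is taken to be the partition coefficient of a plain fuzzy c-means clustering, the restriction is reduced to a minimum subset size (`min_vars`), and the genetic operators (truncation selection, one-point crossover, bit-flip mutation, elitism) and all function and parameter names are hypothetical.

```python
import random


def fuzzy_c_means(rows, k, m=2.0, iters=30, rng=None):
    """Plain fuzzy c-means; returns one membership row per observation."""
    rng = rng or random.Random(0)
    n, dim = len(rows), len(rows[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        w = [rng.random() + 1e-9 for _ in range(k)]
        s = sum(w)
        u.append([x / s for x in w])
    for _ in range(iters):
        # Cluster centres are membership-weighted means of the observations.
        centres = []
        for c in range(k):
            wts = [u[i][c] ** m for i in range(n)]
            total = sum(wts)
            centres.append([sum(wts[i] * rows[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships follow from the squared distances to the centres.
        for i in range(n):
            d2 = [sum((rows[i][d] - centres[c][d]) ** 2 for d in range(dim)) + 1e-12
                  for c in range(k)]
            inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
            s = sum(inv)
            u[i] = [x / s for x in inv]
    return u


def fitness(mask, data, k, min_vars):
    """Assumed quality measure: partition coefficient on the selected variables."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if len(cols) < min_vars:
        return 0.0  # assumed restriction violated: subset too small
    rows = [[r[j] for j in cols] for r in data]
    u = fuzzy_c_means(rows, k, rng=random.Random(0))
    return sum(x * x for row in u for x in row) / len(u)


def select_variables(data, k, min_vars=2, pop_size=12, generations=15,
                     p_mut=0.1, seed=1):
    """Genetic search over 0/1 masks that mark which variables are kept."""
    rng = random.Random(seed)
    p = len(data[0])
    pop = [[rng.randint(0, 1) for _ in range(p)] for _ in range(pop_size)]
    best, best_fit = None, -1.0
    for _ in range(generations):
        scored = sorted(((fitness(ind, data, k, min_vars), ind) for ind in pop),
                        reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0][0], scored[0][1][:]
        parents = [ind for _, ind in scored[: pop_size // 2]]  # truncation selection
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, p)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return best, best_fit
```

Calling `select_variables(data, k=2)` on a list of equal-length numeric rows returns the best mask found and its quality value. A quality measure of this kind tends to favour small subsets, which is one reason a restriction that keeps the subset representative of the whole data is needed; the size bound used here is only a stand-in for the restrictions described in the paper.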

Copyright information

© 2005 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Röver, C., Szepannek, G. (2005). Application of a Genetic Algorithm to Variable Selection in Fuzzy Clustering. In: Weihs, C., Gaul, W. (eds) Classification — the Ubiquitous Challenge. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28084-7_80
