Cluster Analysis

Data Mining with SPSS Modeler

Abstract

A cluster analysis is used to identify groups of objects that are “similar.” This chapter explains the general procedure for determining clusters of similar objects. To do so, measures of similarity or dissimilarity are outlined.
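As a rough illustration of what a dissimilarity measure does, the sketch below computes the Euclidean distance between objects and finds, for a given object, the one it is most similar to. It is a minimal Python sketch and not the SPSS Modeler implementation; the customer data, the feature choice (age, purchases per year), and the helper names `euclidean_distance` and `nearest_neighbor` are assumptions made for this example only.

```python
import math


def euclidean_distance(a, b):
    """Dissimilarity between two objects given as equal-length numeric tuples."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical customers described by (age, purchases per year).
customers = {
    "A": (25, 4),
    "B": (27, 5),
    "C": (60, 20),
}


def nearest_neighbor(name, objects):
    """Return the name of the object least dissimilar to `name`."""
    others = (other for other in objects if other != name)
    return min(
        others,
        key=lambda other: euclidean_distance(objects[name], objects[other]),
    )


if __name__ == "__main__":
    for name in customers:
        print(name, "is most similar to", nearest_neighbor(name, customers))
```

A small distance means two objects are similar, so a clustering procedure would tend to place customers A and B in the same group. In practice the features would usually be standardized first, since otherwise the variable with the largest scale dominates the distance.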

After finishing this chapter, the reader is able to …

  1. evaluate data using more complex statistical techniques such as cluster analysis,

  2. explain the differences between the TwoStep and K-Means algorithms as approaches to clustering large datasets,

  3. describe the advantages and the pitfalls of the cluster analysis methods,

  4. apply TwoStep or K-Means and explain the results, and

  5. describe the usage of the Auto Clustering node of the IBM SPSS Modeler and its pitfalls.

Ultimately, the reader will be called upon to propose well thought-out and practical business actions from the statistical results.




Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Wendler, T., Gröttrup, S. (2016). Cluster Analysis. In: Data Mining with SPSS Modeler. Springer, Cham. https://doi.org/10.1007/978-3-319-28709-6_7
