Abstract
Cluster analysis is used to identify groups of objects that are “similar.” This chapter explains the general procedure for determining clusters of similar objects and outlines the measures of similarity and dissimilarity on which that procedure relies.
After finishing this chapter, the reader is able to …
1. evaluate data using more complex statistical techniques such as cluster analysis,
2. explain the differences between several approaches to handling large datasets in cluster analysis, using the TwoStep or K-Means algorithm,
3. describe the advantages and pitfalls of cluster analysis methods,
4. apply TwoStep or K-Means and explain the results, as well as
5. describe the usage of the Auto Clustering node of the IBM SPSS Modeler and its pitfalls.
Ultimately, the reader will be called upon to propose well thought-out and practical business actions from the statistical results.
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Wendler, T., Gröttrup, S. (2016). Cluster Analysis. In: Data Mining with SPSS Modeler. Springer, Cham. https://doi.org/10.1007/978-3-319-28709-6_7
DOI: https://doi.org/10.1007/978-3-319-28709-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28707-2
Online ISBN: 978-3-319-28709-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)