Statistical and Methodological Considerations When Using Cluster Analysis in Neuropsychological Research

Abstract

Multivariate classification of variables into mathematically definable and homogeneous subsets is often a useful first step in pattern recognition prior to formal statistical analysis of data sets. One such methodology, cluster analysis, has the main goal of grouping entities that share common characteristics and data structure. One aim of such an analysis is to gain insight into the variables that determine group membership so that new data can be classified easily; another is to develop subsets of data that share certain characteristics, which facilitates statistical analysis of variables hypothesized to be related to the clustered entities. Cluster analysis can therefore be useful when applied to neuropsychological variables, particularly when an empirical statistical approach to classification is desirable or when significant interindividual differences in neuropsychological function exist within clinical populations. The goal of this chapter is to provide a review of clustering methods, including hierarchical agglomerative methods and iterative partitioning methods. Recommendations for determining the appropriate number of clusters and for comparing clustering methods are also discussed, as are validation techniques. The chapter concludes with a discussion of data issues commonly encountered in neuropsychological research, such as non-normality of data and incomplete data records from patients, and techniques for handling these situations. Finally, a table outlining various statistical packages is provided for the reader interested in the availability of software for computation.

The views presented in this chapter are those of the author(s) and do not necessarily represent the views of the US Department of Veterans Affairs.
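
As a rough illustration of what a hierarchical agglomerative method does, the sketch below starts with each case in its own cluster and repeatedly merges the two closest clusters until a requested number remains. It is not taken from the chapter: the average-linkage rule, the function names `average_linkage` and `agglomerate`, and the six pairs of standardized scores are all assumptions chosen to keep the example small.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs of cases."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per case
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    # Return case indices, sorted for a stable, readable result.
    return sorted(sorted(c) for c in clusters)


# Hypothetical standardized (memory, attention) scores for six cases.
scores = [(-1.2, -1.0), (-1.0, -1.3), (-0.9, -1.1),
          (0.8, 1.0), (1.1, 0.9), (1.0, 1.2)]
groups = agglomerate(scores, n_clusters=2)
print(groups)  # [[0, 1, 2], [3, 4, 5]]
```

In practice the number of clusters is rarely known in advance, which is why the chapter devotes attention to criteria for choosing it; here `n_clusters=2` is simply fixed by hand.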

Author information

Corresponding author

Correspondence to Chad L. Cross Ph.D., P.Stat®, L.C.A.D.C., M.F.T.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Cross, C.L. (2013). Statistical and Methodological Considerations When Using Cluster Analysis in Neuropsychological Research. In: Allen, D., Goldstein, G. (eds) Cluster Analysis in Neuropsychological Research. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6744-1_2
