DESPOTA: An Algorithm to Detect the Partition in the Extended Hierarchy of a Dendrogram
DESPOTA is a method for seeking the best partition among those hosted in a dendrogram. The algorithm visits the nodes from the tree root toward the leaves. At each node, a permutation test assesses the null hypothesis that the two descending branches sustain only one cluster of units. At the end of the procedure, a partition of the data into clusters is returned. This paper focuses on the interpretation of the test statistic using a data-driven approach, exploiting a real dataset to show the details of the test statistic and the algorithm in action. The working principle of DESPOTA is shown in the light of the Lance–Williams recurrence formula, which embeds all types of agglomeration methods.
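To make the role of the Lance–Williams recurrence concrete, the following is a minimal sketch of how a single update rule can drive several agglomeration methods. It is not the DESPOTA procedure itself and contains no permutation test. The function names `lance_williams` and `agglomerate`, the restriction to single, complete and average linkage, and the use of one-dimensional points with absolute difference as the dissimilarity are all assumptions made for illustration.

```python
from itertools import combinations


def lance_williams(d_ki, d_kj, d_ij, n_i, n_j, method):
    """Dissimilarity between cluster k and the merged cluster (i U j).

    Implements d(k, i U j) = a_i*d(k,i) + a_j*d(k,j) + b*d(i,j) + g*|d(k,i) - d(k,j)|
    with the standard coefficients for three linkage methods.
    """
    if method == "single":
        a_i, a_j, b, g = 0.5, 0.5, 0.0, -0.5
    elif method == "complete":
        a_i, a_j, b, g = 0.5, 0.5, 0.0, 0.5
    elif method == "average":
        a_i, a_j, b, g = n_i / (n_i + n_j), n_j / (n_i + n_j), 0.0, 0.0
    else:
        raise ValueError(f"unsupported method: {method}")
    return a_i * d_ki + a_j * d_kj + b * d_ij + g * abs(d_ki - d_kj)


def agglomerate(points, method="average"):
    """Build a dendrogram over 1-D points (illustrative assumption).

    Returns the merges as (left_leaves, right_leaves, height), in the order
    in which they occur, so the last entry is the tree root.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    # Pairwise dissimilarities between the initial singleton clusters.
    d = {
        frozenset((a, b)): abs(points[min(a)] - points[min(b)])
        for a, b in combinations(clusters, 2)
    }
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        i, j = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        d_ij = d[frozenset((i, j))]
        merged = i | j
        rest = [k for k in clusters if k not in (i, j)]
        # Update dissimilarities to the new cluster via the recurrence only;
        # the original points are never consulted again.
        for k in rest:
            d[frozenset((k, merged))] = lance_williams(
                d[frozenset((k, i))], d[frozenset((k, j))], d_ij,
                len(i), len(j), method,
            )
        clusters = rest + [merged]
        merges.append((sorted(i), sorted(j), d_ij))
    return merges
```

On the four points 0, 1, 5 and 7, each of the three methods first merges {0, 1} and then {2, 3} (by index). The height of the root differs: 4 under single linkage, 7 under complete linkage and 5.5 under average linkage. A top-down procedure such as the one described above would start its visit from that root node.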