## Abstract

The distinction between classification and clustering is often based on a priori knowledge of classification labels. However, in the purely theoretical situation where a data-generating model is known, the optimal solutions for clustering do not necessarily correspond to optimal solutions for classification. Exploring this divergence leads us to conclude that no standard measure of either internal or external validation can guarantee a correspondence with optimal clustering performance. We provide recommendations for the suboptimal evaluation of clustering performance. Such suboptimal approaches can provide valuable insight to researchers hoping to add a post hoc interpretation to their clusters. Indices based on pairwise linkage provide the clearest probabilistic interpretation, while a triplet-based index yields information on higher-level structures in the data. Finally, a graphical examination of receiver operating characteristics generated from hierarchical clustering dendrograms can convey information that would be lost in any single-number summary.
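To make the idea of a pairwise-linkage index concrete, the following is a minimal sketch and not necessarily the exact definition used in the paper. It assumes that a pair of items counts as a "positive" when both items share a reference label, and as a "predicted positive" when the clustering places both in the same cluster. Under that assumption, sensitivity is the proportion of same-class pairs that are linked, and specificity is the proportion of different-class pairs that are kept apart. The function name and the toy data are hypothetical.

```python
from itertools import combinations


def pairwise_sensitivity_specificity(true_labels, cluster_labels):
    """Return (sensitivity, specificity) over all unordered pairs of items.

    Assumed convention (not taken from the paper): a pair is 'positive'
    when both items share a reference label, and 'predicted positive'
    when the clustering links them in one cluster.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")

    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_class and same_cluster:
            tp += 1  # same class, correctly linked
        elif same_class:
            fn += 1  # same class, but split across clusters
        elif same_cluster:
            fp += 1  # different classes, but linked
        else:
            tn += 1  # different classes, correctly separated

    # Guard against degenerate inputs with no positive or no negative pairs.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: six items, two reference classes, three clusters.
truth = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 1, 1, 2, 2]
sens, spec = pairwise_sensitivity_specificity(truth, clusters)
```

In this toy example, two of the six same-class pairs are linked and eight of the nine different-class pairs are separated, so the sketch returns a sensitivity of 1/3 and a specificity of 8/9. Both quantities can be read as proportions of item pairs, which is one way to understand the probabilistic interpretation mentioned above.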

## Keywords

Classification, Clustering, Sensitivity, Specificity, Triplet index, Hierarchical receiver operating characteristic

## Notes

### Acknowledgments

The authors thank the National Cancer Institute for supporting this research through the training grant “Biostatistics for Research in Genomics and Cancer,” NCI grant 5T32CA106209-07 (T32), and the National Institute of Environmental Health Sciences for supporting it through the training grant T32ES007018.
