The Progressive Single Linkage Algorithm Based on Minkowski Ultrametrics

  • Sergio Scippacercola
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


This paper addresses the problem of finding an ultrametric whose distortion is close to optimal. We introduce the Minkowski ultrametric distances of the n statistical units obtained by a hierarchical clustering method (single linkage). We consider the distortion matrix, which measures the difference between the initial dissimilarity and its ultrametric approximation. We propose an algorithm that, by applying the Minkowski ultrametrics, reaches a minimum approximation. The convergence of the algorithm allows us to identify when the ultrametric approximation is at a local minimum. The validity of the algorithm is confirmed by its application to sets of real data.
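
To make the objects named in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It computes the ultrametric induced by single linkage on a small dissimilarity matrix (the minimax-path, or subdominant, ultrametric) and then a distortion matrix taken as the entrywise difference between the dissimilarity and that ultrametric. The function names, the toy matrix, and the choice of a plain difference for the distortion are assumptions made for illustration; the Minkowski ultrametrics and the progressive step are not reproduced, since the abstract does not define them.

```python
def single_linkage_ultrametric(d):
    """Single-linkage (subdominant) ultrametric of a symmetric dissimilarity
    matrix d, given as a list of lists with a zero diagonal."""
    n = len(d)
    u = [row[:] for row in d]
    # Minimax-path closure: u[i][j] ends up as the smallest achievable value
    # of the largest step over all paths from i to j. This equals the level
    # at which single linkage first merges i and j.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u


def distortion_matrix(d, u):
    """Entrywise difference d - u (assumed form of the distortion)."""
    n = len(d)
    return [[d[i][j] - u[i][j] for j in range(n)] for i in range(n)]


def is_ultrametric(u):
    """Check the strong triangle inequality u[i][j] <= max(u[i][k], u[k][j])."""
    n = len(u)
    return all(
        u[i][j] <= max(u[i][k], u[k][j])
        for i in range(n) for j in range(n) for k in range(n)
    )


# Hypothetical toy dissimilarities among three units.
d = [
    [0, 2, 6],
    [2, 0, 3],
    [6, 3, 0],
]
u = single_linkage_ultrametric(d)
delta = distortion_matrix(d, u)
```

In this toy case only the pair (0, 2) is distorted: its dissimilarity of 6 is replaced by 3, the largest step on the path through unit 1. Because the single-linkage ultrametric never exceeds the original dissimilarity, every entry of the difference is non-negative.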



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Dipartimento di Matematica e Statistica, Università degli Studi di Napoli Federico II, Napoli, Italy
