Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Abstract

The basic data consist of a finite set E equipped with a dissimilarity or similarity function. The elements of E are all of the same nature: as seen in Chap. 3, E can be a set of attributes, a set of objects or a set of categories. Tables 3.4 and 3.5 of Chap. 3 give a precise idea of the different possible versions of E. In addition, the set E may be weighted by a positive numerical measure \(\mu _{E}\) that assigns to each element \(x \in E\) a weight \(\mu _{x}\), which defines the “importance” with which x is to be considered. The dissimilarity or similarity function on E is usually numerical, but it may be ordinal. In Chaps. 4–7, the facets of building a similarity index or association coefficient on E were examined in detail in relation to the descriptive nature of E.
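As a minimal sketch of the data described above, the following Python snippet stores a small weighted set E with a numerical dissimilarity and derives an ordinal version from it. The element names, weights and dissimilarity values are hypothetical, and reading "ordinal" as a ranking of the unordered pairs of E by increasing dissimilarity is an assumption made for illustration, not a construction taken from the chapter.

```python
from itertools import combinations

# Hypothetical toy data: names, weights and dissimilarities are illustrative only.
E = ["a", "b", "c", "d"]
mu = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}  # positive weights mu_x

# Symmetric numerical dissimilarity, stored once per unordered pair.
d = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "c")): 4.0,
    frozenset(("a", "d")): 5.0,
    frozenset(("b", "c")): 3.0,
    frozenset(("b", "d")): 5.0,
    frozenset(("c", "d")): 2.0,
}


def dissimilarity(x, y):
    """Return the dissimilarity between x and y (0 for identical elements)."""
    return 0.0 if x == y else d[frozenset((x, y))]


def ordinal_ranks(elements, diss):
    """Rank unordered pairs by increasing dissimilarity; tied pairs share a rank."""
    pairs = list(combinations(elements, 2))
    levels = sorted({diss(x, y) for x, y in pairs})
    rank_of = {value: r for r, value in enumerate(levels, start=1)}
    return {frozenset(p): rank_of[diss(*p)] for p in pairs}


ranks = ordinal_ranks(E, dissimilarity)
```

Here `ranks` keeps only the order of the dissimilarity values, so any increasing transformation of `d` yields the same ordinal data; the weights in `mu` are stored alongside but are not used by the ranking.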

Notes

  1. Two total preorders \(\omega \) and \(\omega '\) on a given set are comparable if the graph of one of them is included in the graph of the other.
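The comparability condition in this note can be checked mechanically. The sketch below assumes that each total preorder is given by a rank function on the same finite set (a Python dict from elements to numbers), so that its graph is the set of ordered pairs (x, y) with rank(x) ≤ rank(y); this representation and the function names are illustrative assumptions.

```python
from itertools import product


def graph(rank):
    """Graph of the total preorder induced by a rank function: all (x, y) with x <= y."""
    return {(x, y) for x, y in product(rank, repeat=2) if rank[x] <= rank[y]}


def comparable(rank1, rank2):
    """True if the graph of one preorder is included in the graph of the other."""
    g1, g2 = graph(rank1), graph(rank2)
    return g1 <= g2 or g2 <= g1


# Hypothetical preorders on the same three elements.
omega = {"p": 1, "q": 2, "r": 3}          # strict order p < q < r
omega_coarse = {"p": 1, "q": 1, "r": 2}   # p and q tied, both before r
omega_swapped = {"p": 2, "q": 1, "r": 3}  # q before p

print(comparable(omega, omega_coarse))
print(comparable(omega, omega_swapped))
```

In this example `omega_coarse` is obtained from `omega` by merging two adjacent classes, so its graph contains that of `omega`; `omega_swapped` reverses one pair, so neither graph contains the other.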

Author information

Corresponding author: Israël César Lerman

Copyright information

© 2016 Springer-Verlag London

About this chapter

Cite this chapter

Lerman, I.C. (2016). Building a Classification Tree. In: Foundations and Methods in Combinatorial and Statistical Data Analysis and Clustering. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-6793-8_10

  • DOI: https://doi.org/10.1007/978-1-4471-6793-8_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6791-4

  • Online ISBN: 978-1-4471-6793-8

  • eBook Packages: Computer Science (R0)