Abstract
The data are defined by the observation of a set \(\mathcal {A}\) of descriptive attributes on a set \(\mathcal {O}\) of elementary objects. As indicated in the introduction of the preceding chapter (see Sect. 5.1 of Chap. 5), \(\mathcal {A}\) consists of attributes of the same type, belonging to the general type II (see Sect. 3.3 of Chap. 3). To fix ideas in this introduction, we may imagine \(\mathcal {A}\) as composed of nominal categorical attributes. The different comparison cases are listed at the beginning of the following section (see Sect. 6.2). For this comparison, as stated in the introductory Sect. 5.1 of Chap. 5, the LLA approach is emphasized. It leads, in a unified process, to a very rich family of probabilistic association coefficients between descriptive attributes of any type. Moreover, the principle of this method enables several association coefficients to be compared with one another.
Copyright information
© 2016 Springer-Verlag London
About this chapter
Cite this chapter
Lerman, I.C. (2016). Comparing Attributes by a Probabilistic and Statistical Association II. In: Foundations and Methods in Combinatorial and Statistical Data Analysis and Clustering. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-6793-8_6
DOI: https://doi.org/10.1007/978-1-4471-6793-8_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-6791-4
Online ISBN: 978-1-4471-6793-8
eBook Packages: Computer Science, Computer Science (R0)