
ZigZag, a New Clustering Algorithm to Analyze Categorical Variable Cross-Classification Tables

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 1704)

Abstract

This paper proposes ZigZag, a new clustering algorithm that works on cross-classification tables of categorical variables. ZigZag simultaneously creates two partitions, one of the row categories and one of the column categories, according to the equivalence relation "to have the same conditional mode". The two partitions are put in one-to-one correspondence, which yields row-column clusters. The result is an efficient KDD tool that can be applied to any database. Moreover, ZigZag visualizes predictive association for nominal data in the sense of Guttman, Goodman and Kruskal. Under this approach, the rule for predicting a nominal variable Y given another variable X consists in choosing the conditionally most probable category of Y given X, and the power of this rule is measured by the mean proportional reduction in error, denoted λ_{Y/X}. The mapping produced by ZigZag thus appears to play the same role for nominal data as the scatter diagram, together with the curves of conditional means or the regression line, plays for quantitative data: the former is supplemented with the values of λ_{Y/X} and λ_{X/Y}, the latter with the correlation ratio or R².
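The prediction rule and the λ measure described in the abstract can be illustrated with a small sketch. It is not the ZigZag implementation, which the abstract does not detail. It only computes the conditional mode of each row, the standard Goodman-Kruskal λ_{Y/X} = (Σ_i max_j n_ij − max_j n_.j) / (n − max_j n_.j), and a grouping of rows that share the same conditional mode. The function names, the tie-breaking choice (first maximum), and the example counts are all assumptions made for illustration.

```python
from collections import defaultdict


def conditional_modes(table):
    """For each row (category of X), return the index of the column
    (category of Y) with the largest count. Ties go to the first maximum,
    which is an arbitrary choice made for this sketch."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in table]


def goodman_kruskal_lambda(table):
    """Goodman-Kruskal lambda for predicting the column variable Y
    from the row variable X (proportional reduction in error)."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    best_marginal = max(col_totals)                     # hits when ignoring X
    best_conditional = sum(max(row) for row in table)   # hits when using X
    if n == best_marginal:
        raise ValueError("lambda is undefined: Y takes a single category")
    return (best_conditional - best_marginal) / (n - best_marginal)


def group_rows_by_mode(table):
    """Group row indices that share the same conditional mode.
    This is one direct reading of the equivalence relation quoted in the
    abstract, not the full row-column pairing performed by ZigZag."""
    groups = defaultdict(list)
    for i, j in enumerate(conditional_modes(table)):
        groups[j].append(i)
    return dict(groups)


def transpose(table):
    """Swap the roles of X and Y so the same functions give lambda X/Y."""
    return [list(col) for col in zip(*table)]


# Hypothetical 3x3 contingency table (rows: X, columns: Y).
table = [
    [30, 5, 5],
    [25, 10, 5],
    [5, 5, 40],
]

print(conditional_modes(table))                  # [0, 0, 2]
print(goodman_kruskal_lambda(table))             # 0.5
print(goodman_kruskal_lambda(transpose(table)))  # 0.375
print(group_rows_by_mode(table))                 # {0: [0, 1], 2: [2]}
```

In this made-up table, knowing X halves the prediction errors on Y (λ_{Y/X} = 0.5), while knowing Y reduces errors on X by 37.5%. The first two rows fall in the same group because they share the modal column 0.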

References

  1. Agresti, A.: Categorical Data Analysis. John Wiley, New York (1990)


  2. Bergeron, S., Lallich, S., Le Bas, C.: Location of Innovative Activities and Technological Structure in the French Economy, 1985-90: Some Evidences from U.S. Patenting. Research Policy 26, 733–751 (1998)


  3. Chauchat, J.H., Risson, A.: Bertin’s Graphics and Multidimensional Data Analysis. In: Blasius, J., Greenacre, M. (eds.) Visualization of Categorical Data. Academic Press, London (1998)


  4. Goodman, L.A., Kruskal, W.H.: Measures of Association for Cross-Classifications I. JASA 49, 732–764 (1954)


  5. Guillien, F.: Mise en oeuvre de ZigZag sous Delphi: Datamix. Mémoire de Maîtrise Sciences Economiques, Université Lumière Lyon 2 (1998)


  6. Lallich, S.: Concept de diversité et association prédictive. SFDS 99, Grenoble (1999)


  7. Rakotomalala, R., Lallich, S.: Handling Noise with Generalized Entropy of Type Beta in Induction Graphs Algorithms. JCIS, Duke, USA (1998)




Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lallich, S. (1999). ZigZag, a New Clustering Algorithm to Analyze Categorical Variable Cross-Classification Tables. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_49


  • DOI: https://doi.org/10.1007/978-3-540-48247-5_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66490-1

  • Online ISBN: 978-3-540-48247-5

  • eBook Packages: Springer Book Archive