Abstract

Many statistical techniques, particularly multivariate methodologies, focus on extracting information from data and proximity matrices. Rather than relying solely on numerical characteristics, matrix visualization reveals structure in a matrix graphically. This article reviews the history of matrix visualization, then describes its general framework in more detail, along with some extensions. Possible research directions in matrix visualization and information mining are sketched. Color versions of the figures presented in this article, together with software packages, can be obtained from http://gap.stat.sinica.edu.tw/.

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, CH. et al. (2004). Matrix Visualization and Information Mining. In: Antoch, J. (eds) COMPSTAT 2004 — Proceedings in Computational Statistics. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-2656-2_6

  • DOI: https://doi.org/10.1007/978-3-7908-2656-2_6

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-1554-2

  • Online ISBN: 978-3-7908-2656-2

  • eBook Packages: Springer Book Archive
