Computational Statistics, Volume 19, Issue 1, pp 147–158

Hierarchical visual data mining for large-scale data

  • Matthew Ward
  • Wei Peng
  • Xiaoning Wang
Article

Summary

An increasingly important problem in exploratory data analysis and visualization is that of scale: more and more data sets are much too large to analyze using traditional techniques, in terms of either the number of variables or the number of records. One approach to addressing this problem is the development and use of multiresolution strategies, in which we represent the data at different levels of abstraction or detail through aggregation and summarization. In this paper we present an overview of our recent and current activities in the development of a multiresolution exploratory visualization environment for large-scale multivariate data. We have developed visualization, interaction, and data management techniques for effectively dealing with data sets that contain millions of records and/or hundreds of dimensions, and we propose methods for applying similar approaches to extend the system to handle nominal as well as ordinal data.
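
The multiresolution idea in the summary (representing data at several levels of detail through aggregation and summarization) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify a clustering method, so the code below substitutes a simple median split along the widest dimension, and the names `Node`, `build`, and `summaries_at` are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass
class Node:
    """Summary of one cluster of records (hypothetical structure)."""
    count: int   # number of records aggregated in this cluster
    mean: list   # per-dimension mean
    lo: list     # per-dimension minimum
    hi: list     # per-dimension maximum
    children: list = field(default_factory=list)


def build(records, leaf_size=2):
    """Recursively split records at the median of the widest dimension.

    Assumption: a stand-in for whatever hierarchical clustering a real
    system would use; it only serves to produce a cluster tree.
    """
    dims = range(len(records[0]))
    lo = [min(r[d] for r in records) for d in dims]
    hi = [max(r[d] for r in records) for d in dims]
    mean = [fmean(r[d] for r in records) for d in dims]
    node = Node(len(records), mean, lo, hi)
    widest = max(dims, key=lambda d: hi[d] - lo[d])
    if len(records) > leaf_size and hi[widest] > lo[widest]:
        ordered = sorted(records, key=lambda r: r[widest])
        mid = len(ordered) // 2
        node.children = [build(ordered[:mid], leaf_size),
                         build(ordered[mid:], leaf_size)]
    return node


def summaries_at(node, depth):
    """Return the cluster summaries visible at a given level of detail."""
    if depth == 0 or not node.children:
        return [node]
    return [s for c in node.children for s in summaries_at(c, depth - 1)]


# Four two-dimensional records, summarized at two levels of detail.
records = [(1.0, 10.0), (2.0, 12.0), (8.0, 50.0), (9.0, 52.0)]
root = build(records)
coarse = summaries_at(root, 0)  # one summary covering all records
fine = summaries_at(root, 1)    # two summaries, one per sub-cluster
```

A display built on such a tree would draw only the summaries returned for the chosen depth (for example, each cluster's mean with a band from `lo` to `hi`), so the number of drawn items depends on the level of detail rather than on the number of records.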

Keywords

Dimension Cluster; Multiple Correspondence Analysis; Information Visualization; Interactive Exploration; Dimension Hierarchy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Physica-Verlag 2004

Authors and Affiliations

  • Matthew Ward (1)
  • Wei Peng (1)
  • Xiaoning Wang (1)
  1. Computer Science Department, Worcester Polytechnic Institute, Worcester, USA
