Hierarchical Exploration of Large Multivariate Data Sets

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 713)

Abstract

Multivariate data visualization techniques are often limited in the number of data records that can be displayed simultaneously in a readily interpretable way. Because screen size and pixel count are finite, visualizing more than a few thousand data points generally leads to clutter and occlusion. This in turn restricts our ability to detect, classify, and measure phenomena of interest, such as clusters, anomalies, trends, and patterns. In this paper we describe our experiences in developing multi-resolution visualization techniques for large multivariate data sets. By hierarchically clustering the data and displaying aggregation information for each cluster, we can examine the data set at multiple levels of abstraction. In addition, by providing powerful navigation and filtering operations, we can create an environment suitable for interactive exploration without overloading the user with dense information displays. We illustrate the generality of our hierarchical displays by applying them to four popular yet non-scalable visualizations: parallel coordinates, glyphs, scatterplot matrices, and dimensional stacking.
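
The idea of clustering the data hierarchically and showing aggregate information per cluster can be illustrated with a small sketch. The abstract does not say which clustering algorithm or which aggregates the authors use, so everything below is an assumption made for illustration: a naive centroid-linkage agglomerative clustering stands in for the real algorithm, the aggregates are the record count, mean, and per-dimension extents, and the names `Cluster`, `build_hierarchy`, and `cut` are hypothetical. The `cut` function picks the coarsest clusters whose extent stays under a threshold, which is one simple way to model a level of abstraction.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Cluster:
    """A node of the cluster tree with aggregate information (assumed set)."""
    count: int            # number of data records in the cluster
    mean: List[float]     # centroid, one value per dimension
    lo: List[float]       # per-dimension minimum
    hi: List[float]       # per-dimension maximum
    children: List["Cluster"] = field(default_factory=list)

    @property
    def size(self) -> float:
        # Largest extent over all dimensions; 0.0 for a single record.
        return max(h - l for l, h in zip(self.lo, self.hi))


def merge(a: Cluster, b: Cluster) -> Cluster:
    """Combine two clusters and update the aggregates incrementally."""
    n = a.count + b.count
    mean = [(a.count * x + b.count * y) / n for x, y in zip(a.mean, b.mean)]
    lo = [min(x, y) for x, y in zip(a.lo, b.lo)]
    hi = [max(x, y) for x, y in zip(a.hi, b.hi)]
    return Cluster(n, mean, lo, hi, [a, b])


def dist2(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def build_hierarchy(points: Sequence[Sequence[float]]) -> Cluster:
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the closest centroids. Fine for a demo, too slow for large data."""
    clusters = [Cluster(1, list(p), list(p), list(p)) for p in points]
    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: dist2(clusters[ij[0]].mean,
                                               clusters[ij[1]].mean))
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


def cut(root: Cluster, max_size: float) -> List[Cluster]:
    """Return the coarsest clusters whose extent is at most max_size."""
    if root.size <= max_size or not root.children:
        return [root]
    selected: List[Cluster] = []
    for child in root.children:
        selected.extend(cut(child, max_size))
    return selected


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    tree = build_hierarchy(data)
    for c in cut(tree, max_size=2.0):
        print(c.count, c.mean, c.lo, c.hi)
```

Lowering `max_size` descends the tree and yields more, smaller clusters; raising it returns fewer, coarser ones. A display would then draw one aggregate per selected cluster (for example, a mean line with an extent band in parallel coordinates) instead of one mark per record.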

Copyright information

© 2003 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Yang, J., Ward, M.O., Rundensteiner, E.A. (2003). Hierarchical Exploration of Large Multivariate Data Sets. In: Post, F.H., Nielson, G.M., Bonneau, G.-P. (eds) Data Visualization. The Springer International Series in Engineering and Computer Science, vol 713. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1177-9_14

  • DOI: https://doi.org/10.1007/978-1-4615-1177-9_14

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5430-7

  • Online ISBN: 978-1-4615-1177-9

  • eBook Packages: Springer Book Archive
