Abstract
Multivariate data visualization techniques are often limited in the number of data records that can be displayed simultaneously in a manner that allows ready interpretation. Because screen size and pixel count are limited, visualizing more than a few thousand data points generally leads to clutter and occlusion. This in turn restricts our ability to detect, classify, and measure phenomena of interest, such as clusters, anomalies, trends, and patterns. In this paper we describe our experiences in the development of multi-resolution visualization techniques for large multivariate data sets. By hierarchically clustering the data and displaying aggregation information for each cluster, we can examine the data set at multiple levels of abstraction. In addition, by providing powerful navigation and filtering operations, we can create an environment suitable for interactive exploration without overloading the user with dense information displays. We illustrate the generality of our hierarchical displays by successfully applying them to four popular yet non-scalable visualizations: parallel coordinates, glyphs, scatterplot matrices, and dimensional stacking.
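The idea of clustering hierarchically and then displaying aggregates at a chosen level of abstraction can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the chapter's method: the abstract does not say which clustering algorithm or which aggregates are used, so the naive centroid-based agglomerative clustering, the aggregates (count, mean, per-dimension extents), and all names (`Cluster`, `build_tree`, `cut`, `max_radius`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Cluster:
    """A node of the cluster tree carrying aggregate information (assumed set)."""
    count: int                          # number of records in the subtree
    mean: Point                         # per-dimension mean
    lo: Point                           # per-dimension minimum
    hi: Point                           # per-dimension maximum
    left: Optional["Cluster"] = None
    right: Optional["Cluster"] = None

    @property
    def radius(self) -> float:
        """Largest per-dimension extent, used here as a stand-in for cluster size."""
        return max(h - l for l, h in zip(self.lo, self.hi))


def merge(a: Cluster, b: Cluster) -> Cluster:
    """Combine two clusters; aggregates are derived from the children alone."""
    n = a.count + b.count
    mean = tuple((a.count * x + b.count * y) / n for x, y in zip(a.mean, b.mean))
    lo = tuple(min(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(max(x, y) for x, y in zip(a.hi, b.hi))
    return Cluster(n, mean, lo, hi, a, b)


def dist2(a: Cluster, b: Cluster) -> float:
    """Squared Euclidean distance between cluster means."""
    return sum((x - y) ** 2 for x, y in zip(a.mean, b.mean))


def build_tree(points: List[Point]) -> Cluster:
    """Naive O(n^3) agglomerative clustering; `points` must be non-empty."""
    nodes = [Cluster(1, p, p, p) for p in points]
    while len(nodes) > 1:
        # Find the pair of clusters whose means are closest and merge them.
        i, j = min(
            ((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))),
            key=lambda ij: dist2(nodes[ij[0]], nodes[ij[1]]),
        )
        merged = merge(nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def cut(root: Cluster, max_radius: float) -> List[Cluster]:
    """Return the coarsest clusters whose radius does not exceed `max_radius`."""
    if root.left is None or root.radius <= max_radius:
        return [root]
    return cut(root.left, max_radius) + cut(root.right, max_radius)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    tree = build_tree(data)
    for c in cut(tree, max_radius=2.0):
        print(c.count, c.mean, c.lo, c.hi)
```

Lowering `max_radius` descends the tree toward individual records, while raising it collapses the display toward a single summary cluster. A display would then draw one mark per returned cluster (for example, the mean as a line and the `lo`/`hi` extents as a band) instead of one mark per record.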
© 2003 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Yang, J., Ward, M.O., Rundensteiner, E.A. (2003). Hierarchical Exploration of Large Multivariate Data Sets. In: Post, F.H., Nielson, G.M., Bonneau, GP. (eds) Data Visualization. The Springer International Series in Engineering and Computer Science, vol 713. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1177-9_14
DOI: https://doi.org/10.1007/978-1-4615-1177-9_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5430-7
Online ISBN: 978-1-4615-1177-9
eBook Packages: Springer Book Archive