Abstract
We propose a visualization approach for large online databases that uses the Hilbert space–filling curve to map N–dimensional data points to 2D or 3D points. Dimensionality reduction methods such as principal component analysis (PCA), multidimensional scaling (MDS), and self-organizing maps (SOMs) can map N–dimensional data points with N >> 3 to 2–dimensional or 3–dimensional values that can be visualized. Although popular, these methods require either the calculation of a scatter matrix with its eigenvalues and eigenvectors, or the iteration of a learning algorithm. They therefore cannot run online, can be slow on large databases, and always lose information when the data is mapped from the multidimensional space to the 2D or 3D image. Space–filling curves such as the Peano, Z, and Hilbert curves, by contrast, produce a 1–to–1 mapping between points on a line segment and points in an arbitrary N–dimensional hypercube. This 1–to–1 mapping guarantees that no information is lost in the transformation. The Hilbert space–filling curve in particular is known to preserve the Lebesgue measure and has been proven to produce an optimal mapping, in the sense that an arbitrary contiguous block of information is split into the minimum number of pieces in the mapped space. The Hilbert space–filling curve has been used extensively for indexing and clustering by mapping N–dimensional data points to 1–dimensional values. Here we propose to use the curve to map to 2 or 3 dimensions for visualization: taking advantage of its 1–to–1 nature, we develop a new and generic method that maps N–dimensional data points to 2D or 3D points using the Hilbert space–filling curve. We prove theoretically that each mapping can be calculated in constant time if the order of approximation is fixed, which gives linear O(n) performance in the number n of data points to map.
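As a rough illustration of the idea above, the sketch below sends an N–dimensional grid point to its 1–dimensional Hilbert index and then reads that index back as a point on a 2D (or 3D) Hilbert curve. The abstract does not spell out the authors' construction, so this composition is only one plausible reading. The index calculation follows Skilling's transpose-based algorithm rather than anything stated in the chapter, and the names `axes_to_index`, `index_to_axes`, and `project` are hypothetical.

```python
def axes_to_index(coords, bits):
    """Hilbert index of an integer grid point, each coordinate in [0, 2**bits).

    Assumption: Skilling's transpose method, not necessarily the authors' algorithm.
    """
    x = list(coords)
    n = len(x)
    m = 1 << (bits - 1)
    q = m
    while q > 1:                      # inverse "undo excess work" pass
        p = q - 1
        for i in range(n):
            if x[i] & q:
                x[0] ^= p             # invert the low bits of x[0]
            else:
                t = (x[0] ^ x[i]) & p
                x[0] ^= t             # exchange the low bits of x[0] and x[i]
                x[i] ^= t
        q >>= 1
    for i in range(1, n):             # Gray encode
        x[i] ^= x[i - 1]
    t = 0
    q = m
    while q > 1:
        if x[n - 1] & q:
            t ^= q - 1
        q >>= 1
    x = [v ^ t for v in x]
    h = 0                             # interleave bits, most significant first
    for bit in range(bits - 1, -1, -1):
        for i in range(n):
            h = (h << 1) | ((x[i] >> bit) & 1)
    return h


def index_to_axes(h, bits, dims):
    """Inverse of axes_to_index: grid point for Hilbert index h."""
    total = bits * dims
    x = [0] * dims
    for pos in range(total):          # de-interleave the index bits
        bit = (h >> (total - 1 - pos)) & 1
        x[pos % dims] |= bit << (bits - 1 - pos // dims)
    t = x[dims - 1] >> 1              # Gray decode
    for i in range(dims - 1, 0, -1):
        x[i] ^= x[i - 1]
    x[0] ^= t
    q = 2
    while q != (1 << bits):           # undo excess work
        p = q - 1
        for i in range(dims - 1, -1, -1):
            if x[i] & q:
                x[0] ^= p
            else:
                t = (x[0] ^ x[i]) & p
                x[0] ^= t
                x[i] ^= t
        q <<= 1
    return tuple(x)


def project(coords, bits, out_dims=2):
    """Map an N-dimensional grid point to an out_dims-dimensional grid point.

    Hypothetical composition: N-D point -> Hilbert index -> low-dimensional point.
    The output curve is given enough bits per axis to hold every index, so
    distinct input cells always land on distinct output cells.
    """
    n = len(coords)
    out_bits = (n * bits + out_dims - 1) // out_dims   # ceil(n * bits / out_dims)
    return index_to_axes(axes_to_index(coords, bits), out_bits, out_dims)


# Example: a 4-dimensional point on a 4x4x4x4 grid mapped to a 16x16 image.
pixel = project((3, 1, 0, 2), bits=2)
```

For a fixed number of dimensions and a fixed `bits` (the order of approximation), every loop in this sketch has a fixed length, so each point costs constant time and n points cost O(n), in line with the complexity claim above.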
We build a Hilbert space–filling curve visualization tool that is much faster than the other methods mentioned and can quickly generate many different visualizations of very large datasets, which compensates for the fact that the mapped points are calculated without using statistical information about the data. We compare our approach to MDS and PCA on one benchmark dataset and three real datasets, using distance-preserving and topology-preserving measures as quality criteria. Our experiments indicate that the Hilbert space–filling curve produces a mapping of acceptable quality while achieving much faster visualization, and is therefore especially useful for online visualization of very large datasets.
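The abstract does not define its distance-preserving measure. As a hedged illustration only, the sketch below computes Sammon's stress, a common way to score how well pairwise distances survive a projection (0 means they are preserved exactly). Using this particular measure is an assumption, and the function name `sammon_stress` is hypothetical.

```python
import math


def sammon_stress(high, low):
    """Sammon's stress between original points and their projections.

    Assumption: one common distance-preservation score, not necessarily the
    measure used in the chapter. `high` and `low` are equally long sequences
    of coordinate tuples; the original points must be pairwise distinct.
    """
    total = 0.0   # sum of original pairwise distances
    error = 0.0   # sum of squared distance errors, weighted by 1 / original distance
    for i in range(len(high)):
        for j in range(i + 1, len(high)):
            d_high = math.dist(high[i], high[j])
            d_low = math.dist(low[i], low[j])
            total += d_high
            error += (d_high - d_low) ** 2 / d_high
    return error / total


# Identical configurations preserve every distance, so the stress is zero.
points = [(0, 0), (3, 4), (6, 8)]
stress_identity = sammon_stress(points, points)
```

This pairwise measure costs O(n²) in the number of points, so for very large datasets it would normally be evaluated on a sample.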
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Castro, J., Burns, S. (2007). Online Data Visualization of Multidimensional Databases Using the Hilbert Space–Filling Curve. In: Lévy, P.P., et al. Pixelization Paradigm. VIEW 2006. Lecture Notes in Computer Science, vol 4370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71027-1_9
DOI: https://doi.org/10.1007/978-3-540-71027-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71026-4
Online ISBN: 978-3-540-71027-1
eBook Packages: Computer Science (R0)