Abstract
A dataset with \(M\) items has \(2^M\) subsets, any one of which may be the one we really want. With a good data display, our fantastic pattern-recognition ability can not only cut great swaths through this combinatorial explosion but also extract insights from the visual patterns. These are the core reasons for data visualization. With Parallel Coordinates (abbr. ∥-cs), the search for multivariate relations in high-dimensional datasets is transformed into a 2-D pattern recognition problem. Multidimensional exploration is illustrated on real datasets, in the process describing good query design with atomic queries and compound ones built using boolean operations. Then complex datasets are classified with a geometric classification algorithm based on ∥-cs. The algorithm has low computational complexity and provides the classification rule explicitly and visually. The minimal set of variables required to state the rule is found and ordered by predictive value. By means of a new divide-and-conquer technique, the classification is extended to previously inaccessible datasets. This new result and others, like the triad adjacency problem, appear for the first time. A visual economic model of a real country is constructed and analyzed to illustrate how multivariate relations can be modeled by means of hypersurfaces. The overview at the end provides the foundational understanding for ∥-cs, examples of exciting recent results like viewing convexity in any dimension and non-orientability (as in the Möbius strip), and a prelude of what is on the way: the discovery and display of relational information in high-dimensional datasets as visual patterns (multidimensional graphs).
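As a rough illustration of the ideas named in the abstract, the following minimal sketch maps an N-dimensional point to the vertices of its polyline on equally spaced vertical axes and builds a compound query from atomic interval queries with boolean operations. All names (`to_polyline`, `interval_query`, `q_and`, `q_or`, `q_not`) and the toy data are assumptions made for this example; they are not taken from the chapter or from the software it uses.

```python
def to_polyline(point, axis_spacing=1.0):
    """Vertices of the polyline representing an N-dimensional point.

    Axis i is assumed to be the vertical line x = i * axis_spacing.
    """
    return [(i * axis_spacing, value) for i, value in enumerate(point)]


def interval_query(axis, lo, hi):
    """Atomic query (hypothetical form): value on one axis lies in [lo, hi]."""
    return lambda item: lo <= item[axis] <= hi


def q_and(*queries):
    """Compound query: every sub-query must hold."""
    return lambda item: all(q(item) for q in queries)


def q_or(*queries):
    """Compound query: at least one sub-query must hold."""
    return lambda item: any(q(item) for q in queries)


def q_not(query):
    """Compound query: negation of a sub-query."""
    return lambda item: not query(item)


# Toy dataset with M = 4 items in 3 dimensions (made up for illustration).
data = [(1.0, 5.0, 2.0), (2.0, 1.0, 4.0), (3.0, 4.0, 0.5), (0.5, 2.5, 3.0)]

# Number of subsets of a dataset with M items: 2^M.
subset_count = 2 ** len(data)

# Items with first variable in [0, 2] AND second variable NOT in [4, 6].
compound = q_and(interval_query(0, 0.0, 2.0),
                 q_not(interval_query(1, 4.0, 6.0)))
selected = [item for item in data if compound(item)]
```

Each atomic query here selects the polylines crossing a chosen interval on one axis; the boolean combinators then narrow or widen that selection without enumerating any of the \(2^M\) subsets explicitly.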
Notes
- 1. The venerable name “Exploratory Data Analysis” (EDA) is used interchangeably with the currently more fashionable “Visual Data Mining”.
- 2. MDG’s Ltd proprietary software (All Rights Reserved) is used by permission.
- 3. This suggests that the Landsat Thematic Mapper band 4 filters out water, though this was unknown to me.
- 4. My dentist really liked this name!
- 5. By \(S_j \subset S_k\) it is meant that the set of points enclosed by the hypersurface \(S_j\) is contained in the set of points enclosed by the hypersurface \(S_k\).
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Inselberg, A. (2012). Discovering and Visualizing Relations in High Dimensional Data. In: Gentle, J., Härdle, W., Mori, Y. (eds) Handbook of Computational Statistics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21551-3_11
DOI: https://doi.org/10.1007/978-3-642-21551-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21550-6
Online ISBN: 978-3-642-21551-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)