Abstract
Symbolic data extend the classical tabular model, where each individual takes exactly one value for each variable, by allowing multiple, possibly weighted, values for each variable. New variable types (interval-valued, categorical multi-valued and modal variables) have been introduced, which make it possible to represent the variability and/or uncertainty inherent to the data. But are we still in the same framework when we allow the variables to take multiple values? Are the definitions of basic notions still so straightforward? Which properties remain valid? In this paper we discuss some issues that arise when trying to apply classical data analysis techniques to symbolic data. The central question of the measurement of dispersion, and the consequences of different possible choices in the design of multivariate methods, are addressed.
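As a rough illustration of the variable types named in the abstract, and of why the measurement of dispersion is not automatic, the following Python sketch may help. All names in it (`Interval`, `midpoint_variance`, `uniform_mixture_variance`) are hypothetical and are not taken from the chapter. The two dispersion measures are only two possible choices, assumed here for illustration: the variance of the interval midpoints, and the variance obtained when each interval is treated as a uniform distribution and the observations are mixed with equal weights.

```python
from dataclasses import dataclass
from statistics import pvariance


@dataclass(frozen=True)
class Interval:
    """Interval-valued observation [lower, upper] (hypothetical helper type)."""
    lower: float
    upper: float

    def __post_init__(self):
        if self.lower > self.upper:
            raise ValueError("lower must not exceed upper")

    @property
    def midpoint(self) -> float:
        return (self.lower + self.upper) / 2


# The other two variable types, shown as plain Python values (illustrative only):
multi_valued = frozenset({"red", "blue"})   # categorical multi-valued: a set of categories
modal = {"red": 0.7, "blue": 0.3}           # modal: categories with weights


def midpoint_variance(intervals):
    """Choice 1 (assumption): population variance of the interval midpoints."""
    return pvariance([iv.midpoint for iv in intervals])


def uniform_mixture_variance(intervals):
    """Choice 2 (assumption): variance of an equal-weight mixture of
    uniform distributions, one per interval."""
    n = len(intervals)
    mean = sum(iv.midpoint for iv in intervals) / n
    # E[X^2] for a uniform distribution on [a, b] is (a^2 + a*b + b^2) / 3.
    second_moment = sum(
        (iv.lower ** 2 + iv.lower * iv.upper + iv.upper ** 2) / 3
        for iv in intervals
    ) / n
    return second_moment - mean ** 2


data = [Interval(0.0, 2.0), Interval(2.0, 4.0)]
v_mid = midpoint_variance(data)          # ignores the width of each interval
v_mix = uniform_mixture_variance(data)   # also reflects spread inside each interval
```

For classical data, written as degenerate intervals with `lower == upper`, the two measures coincide. For proper intervals they generally differ, which is one way to see that a basic notion such as variance involves a choice once variables may take multiple values.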
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Brito, P. (2007). On the Analysis of Symbolic Data. In: Brito, P., Cucumel, G., Bertrand, P., de Carvalho, F. (eds) Selected Contributions in Data Analysis and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73560-1_2
DOI: https://doi.org/10.1007/978-3-540-73560-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73558-8
Online ISBN: 978-3-540-73560-1
eBook Packages: Mathematics and Statistics (R0)