Abstract
Before addressing the summarization and visualization of multidimensional data, this chapter looks at these problems at the simplest level possible: just one feature. This also provides a stock of useful concepts for the material that follows. The concepts of histogram, central point and spread are presented. Two perspectives on the summaries are outlined: one is the classical probabilistic perspective and the other is that of approximation, which extends naturally into the data recovery approach to supply a decomposition of the data scatter into explained and unexplained parts. The difference between categorical and quantitative features is defined through the operation of averaging: quantitative features admit averaging whereas categorical ones do not. This difference is somewhat blurred for binary features representing individual categories. They can be represented by so-called dummy variables, which can be considered quantitative too. Contemporary approaches, nature-inspired optimization and bootstrap validation, are explained using individual cases.
Notes
- 1. To do this, one may start with all sets Q(k) empty and repeatedly run a loop over \(k = 1:K\) in such a way that, at each step, a random entity is drawn from the entity set (with no replacement!) and put into the current Q(k); the process halts when no entities remain outside the sets Q(k).
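The procedure in this note can be sketched in Python as follows. This is only an illustrative sketch, not the author's implementation: the function name `random_partition`, its parameters, and the optional `seed` argument are assumptions introduced here, and the sets Q(k) are stored as lists indexed from 0 rather than from 1.

```python
import random

def random_partition(entities, K, seed=None):
    """Split entities into K parts by cycling over k and drawing without replacement.

    Hypothetical helper: the name, signature and seed argument are illustrative.
    """
    rng = random.Random(seed)            # local generator, so runs can be reproduced
    remaining = list(entities)           # entities not yet placed in any Q(k)
    parts = [[] for _ in range(K)]       # all sets Q(k) start empty
    while remaining:                     # halt when no entities remain outside Q(k)
        for k in range(K):               # loop over k = 1:K (0-based here)
            if not remaining:
                break
            i = rng.randrange(len(remaining))   # pick a random remaining entity
            # Move the chosen entity to the end and pop it: no replacement.
            remaining[i], remaining[-1] = remaining[-1], remaining[i]
            parts[k].append(remaining.pop())
    return parts

parts = random_partition(range(10), K=3, seed=0)
print([len(p) for p in parts])
```

Because the sets are filled in turn, their sizes differ by at most one; which entities land in which set depends on the random draws.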
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Mirkin, B. (2011). 1D Analysis: Summarization and Visualization of a Single Feature. In: Core Concepts in Data Analysis: Summarization, Correlation and Visualization. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-0-85729-287-2_2
DOI: https://doi.org/10.1007/978-0-85729-287-2_2
Publisher Name: Springer, London
Print ISBN: 978-0-85729-286-5
Online ISBN: 978-0-85729-287-2
eBook Packages: Computer Science, Computer Science (R0)