1D Analysis: Summarization and Visualization of a Single Feature

Mirkin, Boris

doi:10.1007/978-0-85729-287-2_2


Part of the book series: Undergraduate Topics in Computer Science (UTICS)


Abstract

Before addressing the summarization and visualization of multidimensional data, this chapter looks at these problems at the simplest possible level: a single feature. This also provides a stock of useful concepts for the material that follows. The concepts of histogram, central point and spread are presented. Two perspectives on the summaries are outlined: the classical probabilistic one and the approximation one, which extends naturally into the data recovery approach and supplies a decomposition of the data scatter into explained and unexplained parts. The difference between categorical and quantitative features is defined through the operation of averaging: quantitative features admit averaging, whereas categorical ones do not. This difference is somewhat blurred for binary features representing individual categories, because they can be represented by so-called dummy variables, which can be treated as quantitative too. Two contemporary approaches, nature-inspired optimization and bootstrap validation, are explained on individual cases.
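As an illustration of the scatter decomposition mentioned in the abstract, the following minimal sketch assumes the simplest approximation model, in which a single feature is summarized by one constant, its mean. Under that assumption the data scatter (the sum of squared values) splits exactly into a part explained by the mean and an unexplained residual part. The function name and the sample values are hypothetical and not taken from the chapter.

```python
def scatter_decomposition(values):
    """Split the data scatter of one feature into explained and unexplained parts.

    Assumption (not from the chapter text): the feature is approximated by a
    single constant c, taken to be the mean, so that
        sum(x_i^2) = N * c^2 + sum((x_i - c)^2).
    """
    n = len(values)
    c = sum(values) / n                             # central point: the mean
    scatter = sum(v * v for v in values)            # total data scatter
    explained = n * c * c                           # part accounted for by c
    unexplained = sum((v - c) ** 2 for v in values) # residual part
    return scatter, explained, unexplained


# Hypothetical sample feature
scatter, explained, unexplained = scatter_decomposition([1.0, 2.0, 3.0, 4.0])
```

The identity holds exactly only when c is the mean, because the cross term, twice c times the sum of deviations from c, then vanishes.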


Notes

1.
To do this, one may start with all sets Q(k) empty and repeatedly run a loop over \(k = 1:K\) in such a way that, at each step, a random entity is drawn from the entity set (without replacement!) and put into the current Q(k); the process halts when no entities remain outside the sets Q(k).
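The procedure in this note can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: the function name `random_partition`, the `seed` parameter, and the use of Python's standard `random` module are choices made here for the example.

```python
import random


def random_partition(entities, K, seed=None):
    """Randomly split entities into K sets Q(1), ..., Q(K) of near-equal size.

    Follows the note: all sets start empty, and the loop cycles over k,
    each time drawing one random entity without replacement and putting it
    into the current Q(k), until no entities remain.
    """
    rng = random.Random(seed)          # seed is a hypothetical convenience
    remaining = list(entities)         # entities not yet assigned
    parts = [[] for _ in range(K)]     # Q(1), ..., Q(K), all empty
    k = 0
    while remaining:
        idx = rng.randrange(len(remaining))
        # Move the drawn entity to the end and pop it: drawing without replacement
        remaining[idx], remaining[-1] = remaining[-1], remaining[idx]
        parts[k].append(remaining.pop())
        k = (k + 1) % K                # move on to the next set, cycling
    return parts


parts = random_partition(range(10), K=3, seed=0)
```

Because the sets are filled in turn, their sizes differ by at most one; with 10 entities and K = 3 the sizes are 4, 3 and 3, whichever entities happen to be drawn.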


Author information

Authors and Affiliations

Research University – Higher School of Economics, School of Applied Mathematics and Informatics, 11 Pokrovsky Boulevard, Moscow, RF, Russia
Boris Mirkin
Department of Computer Science, Birkbeck University of London, Malet Street, London, UK
Boris Mirkin



About this chapter

Cite this chapter

Mirkin, B. (2011). 1D Analysis: Summarization and Visualization of a Single Feature. In: Core Concepts in Data Analysis: Summarization, Correlation and Visualization. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-0-85729-287-2_2


DOI: https://doi.org/10.1007/978-0-85729-287-2_2
Published: 09 February 2011
Publisher Name: Springer, London
Print ISBN: 978-0-85729-286-5
Online ISBN: 978-0-85729-287-2
eBook Packages: Computer Science, Computer Science (R0)
