1D Analysis: Summarization and Visualization of a Single Feature

  • Boris Mirkin
Chapter
Part of the Undergraduate Topics in Computer Science book series (UTICS)

Abstract

Before addressing summarization and visualization of multidimensional data, this chapter looks at these problems at the simplest level possible: a single feature. This also provides a stock of concepts useful for the material that follows. The concepts of histogram, central point and spread are presented. Two perspectives on the summaries are outlined: the classical probabilistic one and that of approximation, which naturally extends into the data recovery approach and supplies a decomposition of the data scatter into explained and unexplained parts. The difference between categorical and quantitative features is defined through the operation of averaging: quantitative features admit averaging whereas categorical ones do not. This difference is somewhat blurred for binary features representing individual categories, which can be coded by so-called dummy variables that can be treated as quantitative too. Two contemporary approaches, nature-inspired optimization and bootstrap validation, are explained on individual cases.
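
As an illustration only, the following minimal Python sketch shows three of the ideas named in the abstract under simple assumptions: the mean as the central point, with the data scatter (sum of squares) split into a part explained by the mean and an unexplained residual part; a dummy variable whose average equals the category's relative frequency; and a percentile bootstrap interval for the mean. The function names, the sample data and the choice of the percentile method are assumptions made for this sketch and are not taken from the chapter.

```python
import random
import statistics


def scatter_decomposition(xs):
    """Split the data scatter sum(x^2) into explained and unexplained parts
    when the feature is summarized by its mean c (assumed central point)."""
    c = statistics.fmean(xs)
    total = sum(x * x for x in xs)               # data scatter
    explained = len(xs) * c * c                  # part accounted for by c
    unexplained = sum((x - c) ** 2 for x in xs)  # residual sum of squares
    return total, explained, unexplained


def dummy_variable(categories, target):
    """Code one category as a 0/1 dummy variable."""
    return [1 if value == target else 0 for value in categories]


def bootstrap_mean_interval(xs, trials=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean (one common variant)."""
    rng = random.Random(seed)
    n = len(xs)
    # Resample with replacement and record the mean of each resample.
    means = sorted(statistics.fmean(rng.choices(xs, k=n)) for _ in range(trials))
    lo_idx = int((1 - level) / 2 * trials)
    hi_idx = int((1 + level) / 2 * trials) - 1
    return means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 6.0]  # hypothetical sample
    print(scatter_decomposition(data))
    print(statistics.fmean(dummy_variable(["a", "b", "a", "c"], "a")))
    print(bootstrap_mean_interval(data))
```

For the hypothetical sample 1, 2, 3, 6 the mean is 3, so the scatter 50 splits into an explained part of 36 and an unexplained part of 14. The average of the dummy variable for category "a" is 0.5, its relative frequency in the four-item list.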

Keywords

Membership Function; Steep Descent; Gini Index; Categorical Feature; Admissible Solution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Boris Mirkin (1, 2)
  1. Research University – Higher School of Economics, School of Applied Mathematics and Informatics, Moscow, Russia
  2. Department of Computer Science, Birkbeck University of London, London, UK
