Before addressing the issue of summarization and visualization of multidimensional data, this chapter looks at these problems at the simplest level possible: a single feature. This also builds a stock of concepts useful for the material that follows. The concepts of histogram, central point and spread are presented. Two perspectives on the summaries are outlined: one is the classical probabilistic perspective, the other that of approximation, which naturally extends into the data recovery approach to supply a decomposition of the data scatter into explained and unexplained parts. The difference between categorical and quantitative features is defined through the operation of averaging: quantitative features admit averaging, whereas categorical ones do not. This difference becomes somewhat blurred for binary features representing individual categories, since these can be represented by so-called dummy variables, which may be treated as quantitative as well. Two contemporary approaches, nature-inspired optimization and bootstrap validation, are explained on individual cases.
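Two of the ideas named above can be shown in a few lines. The sketch below is illustrative only (not the book's code, with made-up data): it checks that the mean, as the least-squares summary of a feature, decomposes the data scatter into explained and unexplained parts; that averaging a 0/1 dummy variable yields the category's frequency; and it computes a basic percentile bootstrap confidence interval for the mean.

```python
import random

# A single quantitative feature (illustrative values)
x = [2.0, 3.0, 5.0, 7.0, 11.0]
n = len(x)
mean = sum(x) / n

# Data scatter T = sum of squared values decomposes as
# T = n*mean^2 (explained) + sum of squared residuals (unexplained)
scatter = sum(v * v for v in x)
explained = n * mean * mean
unexplained = sum((v - mean) ** 2 for v in x)
assert abs(scatter - (explained + unexplained)) < 1e-9

# Averaging a 0/1 dummy variable for a category gives its frequency
d = [1, 0, 0, 1, 1]
frequency = sum(d) / len(d)

# Bootstrap validation of the mean: resample with replacement many times,
# then take percentiles of the resampled means as a confidence interval
random.seed(0)
boot_means = sorted(sum(random.choices(x, k=n)) / n for _ in range(1000))
ci_low, ci_high = boot_means[25], boot_means[974]  # approximate 95% interval
print(mean, frequency, (ci_low, ci_high))
```

The decomposition identity holds for any feature once it is centered around zero by its mean, which is what motivates treating the mean as the "explained" part of the one-dimensional data recovery model.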
Keywords: Membership Function · Steepest Descent · Gini Index · Categorical Feature · Admissible Solution