1D Analysis: Summarization and Visualization of a Single Feature
Before addressing summarization and visualization of multidimensional data, this chapter looks at these problems at the simplest possible level: just one feature. This also provides a stock of useful concepts for further material. The concepts of histogram, central point and spread are presented. Two perspectives on the summaries are outlined: one is the classical probabilistic perspective, and the other is that of approximation, which naturally extends into the data recovery approach to supply a decomposition of the data scatter into explained and unexplained parts. The difference between categorical and quantitative features is defined through the operation of averaging: quantitative features admit averaging, whereas categorical ones do not. This difference is somewhat blurred for binary features representing individual categories. These can be represented by so-called dummy variables, which can be considered quantitative too. Contemporary approaches, nature-inspired optimization and bootstrap validation, are explained on individual cases.
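
As a minimal sketch of two of these ideas, the snippet below assumes the least-squares setting, in which the central point is the mean c and the data scatter is the sum of squared values; the scatter then splits into an explained part N*c^2 and an unexplained part, the sum of squared deviations from c. It also shows that averaging a dummy variable yields the proportion of the corresponding category. The function name `summarize` and the sample values are hypothetical and chosen only for illustration.

```python
def summarize(x):
    """Return the mean and the least-squares scatter decomposition of x."""
    n = len(x)
    c = sum(x) / n                               # central point: the mean
    scatter = sum(v * v for v in x)              # data scatter: sum of squares
    explained = n * c * c                        # part accounted for by c
    unexplained = sum((v - c) ** 2 for v in x)   # residual squared deviations
    return c, scatter, explained, unexplained


# Made-up quantitative feature, for illustration only.
heights = [160.0, 170.0, 180.0]
c, scatter, explained, unexplained = summarize(heights)
# The scatter equals the explained part plus the unexplained part.
assert abs(scatter - (explained + unexplained)) < 1e-9

# Made-up categorical feature: the labels themselves cannot be averaged,
# but the dummy (0/1) variable for a single category can.
colors = ["red", "blue", "red", "green"]
is_red = [1.0 if v == "red" else 0.0 for v in colors]
share_red = sum(is_red) / len(is_red)            # mean of the dummy variable
print(c, scatter, explained, unexplained, share_red)
```

The mean of the dummy variable is simply the relative frequency of the category, which is why such binary features can be treated as quantitative.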