
1D Analysis: Summarization and Visualization of a Single Feature

Core Concepts in Data Analysis: Summarization, Correlation and Visualization

Part of the book series: Undergraduate Topics in Computer Science ((UTICS))


Abstract

Before addressing summarization and visualization of multidimensional data, this chapter looks at these problems at the simplest possible level: just one feature. This also provides a stock of useful concepts for further material. The concepts of histogram, central point and spread are presented. Two perspectives on the summaries are outlined: one is the classical probabilistic perspective and the other is that of approximation, which naturally extends into the data recovery approach to supply a decomposition of the data scatter into explained and unexplained parts. The difference between categorical and quantitative features is defined through the operation of averaging: quantitative features admit averaging whereas categorical ones do not. This difference is somewhat blurred for binary features representing individual categories. They can be represented by so-called dummy variables, which can be considered quantitative too. Contemporary approaches, nature-inspired optimization and bootstrap validation, are explained on individual cases.


Notes

  1. To do this, one may start with all sets Q(k) empty and repeatedly run a loop over \(k = 1:K\) so that, at each step, a random entity is drawn from the entity set (without replacement!) and put into the current Q(k); the process halts when no entities remain outside the sets Q(k).
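The procedure in this note can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's code: the function name `random_partition`, the `seed` parameter, and the use of lists to hold the sets Q(k) are all hypothetical choices made for the example.

```python
import random


def random_partition(entities, K, seed=None):
    """Split `entities` into K parts by cycling over k and drawing
    one random entity without replacement at each step.

    Names and signature are illustrative assumptions, not from the source.
    """
    if K < 1:
        raise ValueError("K must be a positive integer")
    rng = random.Random(seed)          # local generator, reproducible with a seed
    remaining = list(entities)         # entities not yet assigned to any Q(k)
    parts = [[] for _ in range(K)]     # all sets Q(k) start empty
    k = 0
    while remaining:                   # halt when no entities remain unassigned
        idx = rng.randrange(len(remaining))
        # Move the drawn entity to the end and pop it: drawing without replacement.
        remaining[idx], remaining[-1] = remaining[-1], remaining[idx]
        parts[k].append(remaining.pop())
        k = (k + 1) % K                # proceed to the next Q(k), wrapping around
    return parts


if __name__ == "__main__":
    for k, part in enumerate(random_partition(range(10), K=3, seed=0), start=1):
        print(f"Q({k}) = {sorted(part)}")
```

Because the loop cycles over k, the part sizes differ by at most one; with 10 entities and K = 3, the first part receives four entities and the other two receive three each.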


Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Mirkin, B. (2011). 1D Analysis: Summarization and Visualization of a Single Feature. In: Core Concepts in Data Analysis: Summarization, Correlation and Visualization. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-0-85729-287-2_2


  • DOI: https://doi.org/10.1007/978-0-85729-287-2_2


  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-286-5

  • Online ISBN: 978-0-85729-287-2

  • eBook Packages: Computer Science, Computer Science (R0)
