Introduction to Statistics and Data Visualisation

Shardt, Yuri A. W.

doi:10.1007/978-3-319-21509-9_1

Abstract

This chapter introduces the reader to the fundamentals of descriptive statistics and data visualisation. Descriptive statistics provide methods for describing a given data set. They can be divided into two main groups: measures of central tendency, which examine the average behaviour of the data set, and measures of dispersion, which examine the spread of the data set. Measures of central tendency, such as the mean, mode, and median, are introduced, while the measures of dispersion considered include the range, standard deviation, variance, median absolute difference, and skew. Quantiles and outliers are also introduced as ways to describe a data set. Data visualisation focuses on developing a set of rules for displaying data effectively. Common data visualisation methods such as bar charts, histograms, pie charts, line charts, time series plots, box-and-whisker plots, scatter plots, probability plots, tables, and sparkplots are explained together with their methods of construction. The different approaches are illustrated with suitable examples, including a comprehensive analysis of a data set from a friction factor experiment. By the end of this chapter, the reader should be able to apply the principles of data description and visualisation to meaningfully portray the key properties of a given data set.
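
As a rough illustration of the measures named above, the following Python sketch computes several of them with the standard library. The sample values are invented for this example and do not come from the chapter. The median absolute difference is assumed here to mean the median of the absolute deviations from the median, which may differ from the definition the chapter uses.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2.0, 3.0, 3.0, 5.0, 7.0, 10.0]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion
data_range = max(data) - min(data)
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # square root of the sample variance

# Assumption: median of absolute deviations from the median.
mad = statistics.median([abs(x - median) for x in data])

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.2f}, std_dev={std_dev:.2f}, mad={mad}")
```

The sample variance is used rather than the population variance; `statistics.pvariance` and `statistics.pstdev` give the population versions if those are wanted instead.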

Εἰκὸς γὰρ γίνεσθαι πολλὰ καὶ παρὰ τὸ εἰκός.

It is likely that unlikely things should happen.

Aristotle, Poetics, 1456a, 24

Notes

1. If the specific number of tied entries is known, then the data set can be referred to by that number, for example, bimodal for a data set with two modes or trimodal for a data set with three modes.
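
The naming convention in this note can be sketched in Python using `statistics.multimode`, which returns every value tied for the highest count. The helper `describe_modes` and its labels for zero, one, or more than three modes are assumptions made for this illustration, not terminology taken from the chapter.

```python
import statistics

def describe_modes(data):
    """Return the tied modes of data and a label for how many there are."""
    # multimode lists all values sharing the highest frequency,
    # in the order they are first encountered.
    modes = statistics.multimode(data)
    if not modes:
        return modes, "no mode"
    # Labels other than "bimodal" and "trimodal" are assumptions.
    names = {1: "unimodal", 2: "bimodal", 3: "trimodal"}
    return modes, names.get(len(modes), "multimodal")

print(describe_modes([1, 1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3, 3]))
```

Note that when every value occurs exactly once, `multimode` returns all of them, so this sketch would label such a data set by its full count of distinct values.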

Author information

Authors and Affiliations

Institute of Automation and Complex Systems (AKS), University of Duisburg-Essen, Duisburg, North Rhine-Westphalia, Germany
Yuri A. W. Shardt

About this chapter

Cite this chapter

Shardt, Y.A.W. (2015). Introduction to Statistics and Data Visualisation. In: Statistics for Chemical and Process Engineers. Springer, Cham. https://doi.org/10.1007/978-3-319-21509-9_1

DOI: https://doi.org/10.1007/978-3-319-21509-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21508-2
Online ISBN: 978-3-319-21509-9
eBook Packages: Chemistry and Materials Science (R0)
