Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Exploratory Data Analysis

  • Hans Hinterberger
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1384

Definition

Exploratory data analysis (EDA) is an approach to data analysis that employs a number of different techniques to:
  1. Look at data to see what it seems to say.
  2. Uncover underlying structures.
  3. Isolate important variables.
  4. Detect outliers and other anomalies.
  5. Suggest suitable models for conventional statistics.
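
As an illustration of goal 4, the sketch below flags outliers with the 1.5 × IQR fences commonly used when drawing box plots. The entry itself does not prescribe this rule; the function names, the multiplier k, and the sample values are assumptions made for the example.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # "inclusive" treats the data as the whole population of interest;
    # other quartile conventions would give slightly different fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the fences, in input order."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample with one suspiciously large value.
sample = [12, 13, 14, 15, 15, 16, 17, 18, 19, 58]
print(tukey_fences(sample))   # (9.0, 23.0)
print(find_outliers(sample))  # [58]
```

A flagged value is only a candidate for closer inspection; whether it is an error or a genuine observation is a judgment the fences cannot make.
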

Key Points

The term “exploratory data analysis” was introduced by John W. Tukey, who shows in [2] how simple graphical and quantitative techniques can be used to explore data with an open mind.

Typical graphical techniques are:
  1. Plotting the raw data (e.g., stem-and-leaf diagrams, histograms, scatter plots)
  2. Plotting simple statistics (e.g., mean plots, box plots, residual plots)
  3. Positioning (multiple) plots to amplify cognition

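
To make the first graphical technique concrete, here is a minimal sketch of a stem-and-leaf display built as plain text. It assumes non-negative integers, with the tens as stems and the units as leaves; the function name and the sample values are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display for non-negative integers."""
    if not values:
        return []
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)  # stem = tens and above, leaf = units digit
    lines = []
    # Keep empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


# Hypothetical measurements.
data = [23, 25, 31, 31, 36, 42, 47, 47, 49, 58, 71]
for line in stem_and_leaf(data):
    print(line)
```

Unlike a histogram, the display keeps every raw value readable while still showing the overall shape of the batch.
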
Typical quantitative techniques are:
  1. Interval estimation
  2. Measures of location or of scale
  3. Shapes of distributions
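
The three quantitative techniques can be sketched together using only the Python standard library. This is an illustration under stated assumptions, not a method taken from the entry: the interval is a normal-approximation 95% interval for the mean, the shape measure is the moment-based skewness, and the function name and sample values are hypothetical.

```python
import math
import statistics


def summarize(values):
    """Return simple location, scale, interval, and shape summaries."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

    # Interval estimation: rough 95% interval for the mean, assuming the
    # normal approximation is acceptable (z = 1.96); small samples would
    # call for a t quantile instead.
    half_width = 1.96 * stdev / math.sqrt(n)

    # Shape: moment-based skewness m3 / m2**1.5 (0 for symmetric data).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0

    return {
        "mean": mean,                       # location
        "median": statistics.median(values),  # robust location
        "stdev": stdev,                     # scale
        "iqr": q3 - q1,                     # robust scale
        "ci95": (mean - half_width, mean + half_width),
        "skewness": skewness,
    }


# Hypothetical sample.
values = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in summarize(values).items():
    print(name, value)
```

Comparing the mean with the median, and the standard deviation with the interquartile range, gives a quick hint of whether a few extreme values dominate the summary.
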

Exploratory data analysis can help to improve the results of statistical hypothesis...

Recommended Reading

  1. Berry MJA, Linoff GS. Mastering data mining. New York: Wiley; 2000.
  2. Tukey JW. Exploratory data analysis. Reading: Addison Wesley; 1977.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, ETH Zurich, Zurich, Switzerland

Section editors and affiliations

  • Hans Hinterberger (1)
  1. Inst. of Scientific Computing, ETH Zürich, Zurich, Switzerland