2D Analysis: Correlation and Visualization of Two Features

  • Boris Mirkin
Part of the Undergraduate Topics in Computer Science book series (UTICS)


The chapter outlines several important techniques for summarizing and correlating two features and presents some of their properties. For two quantitative variables, these are linear regression and the correlation coefficient. For the mixed-scale case (one quantitative and one nominal feature), they are tabular regression, the correlation ratio, the decomposition of the quantitative feature’s scatter, and the nearest neighbor classifier. For two nominal variables, they are the contingency table, the Quetelet index, statistical independence, and Pearson’s chi-squared. The chi-squared coefficient is treated here as a summary correlation measure, in contrast to the conventional view of it as a criterion of statistical independence. All of these apply to multidimensional data as well.
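To make the reading of chi-squared as a summary correlation measure concrete, the following is a minimal sketch rather than the chapter's own code. It assumes the relative Quetelet index q(l|k) = p_kl / (p_k p_l) − 1, under which Pearson's chi-squared divided by the number of observations equals the average Quetelet index weighted by the joint relative frequencies. The function name `quetelet_and_chi2` and the example counts are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: Quetelet indices and Pearson's chi-squared for a
# small contingency table of counts. Assumes no row or column total is zero.

def quetelet_and_chi2(table):
    """Return (Quetelet index matrix, chi-squared, average Quetelet index)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    rows, cols = range(len(row_tot)), range(len(col_tot))

    # Relative Quetelet index: q(l|k) = p_kl / (p_k * p_l) - 1, the relative
    # change in the frequency of column l once row k is known.
    q = [[n * table[k][l] / (row_tot[k] * col_tot[l]) - 1.0 for l in cols]
         for k in rows]

    # Pearson's chi-squared from observed counts and the counts expected
    # under statistical independence.
    chi2 = 0.0
    for k in rows:
        for l in cols:
            expected = row_tot[k] * col_tot[l] / n
            chi2 += (table[k][l] - expected) ** 2 / expected

    # Average Quetelet index, weighted by joint relative frequencies p_kl.
    avg_q = sum(table[k][l] / n * q[k][l] for k in rows for l in cols)
    return q, chi2, avg_q


# Made-up 2x2 table of counts (rows and columns are categories of two
# nominal features).
counts = [[20, 10], [5, 25]]
q, chi2, avg_q = quetelet_and_chi2(counts)
n = sum(sum(row) for row in counts)
print(q)
print(chi2 / n, avg_q)  # the two values agree up to rounding error
```

In this reading, a larger chi-squared means that knowing the category of one feature changes the frequencies of the other feature's categories more, on average, which is a statement about association strength rather than only a test of independence.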


Keywords: Contingency Table, Statistical Independence, Nominal Feature, Student Data, Correlation Ratio



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Boris Mirkin (1, 2)
  1. Research University – Higher School of Economics, School of Applied Mathematics and Informatics, Moscow, Russia
  2. Department of Computer Science, Birkbeck University of London, London, UK
