Abstract
(a) The concept of correlation. Covariance and the correlation coefficient; (b) Testing for correlation. Rank correlation tests; (c) Correlation test pitfalls. Least squares fitting. Fitting to a straight line; (d) Testing the fit. Curvilinear line fitting; (e) Arbitrary function fitting.
A large part of science concerns looking for, and then trying to understand, causal relations between observed quantities. This is much harder where random variables are concerned. For example, take a look at Fig. 9.1. Each data point represents a galaxy where both the mass of the “bulge” component of a galaxy, and the mass of the central supermassive black hole it contains, have been estimated. It looks like these things are connected, which could be important. But the points are rather scattered, and have large error bars. Are we just being fooled by a chance distribution of points drawn from some random distribution? Note also that the authors have drawn a line going through the data, hopefully representing the true relationship between black hole mass and bulge mass. But is the slope of the line right? How do we decide what the “best” slope is? And what if a straight line isn’t the right mathematical form? Can I test the prediction for my favourite theory?
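The questions raised above (is the apparent connection real, and what is the "best" slope?) can be made concrete with a small numerical sketch. The data below are synthetic and invented purely for illustration; they are not the galaxy measurements of Fig. 9.1. The function names `pearson_r`, `permutation_p_value`, and `fit_line` are hypothetical. The sketch assumes an unweighted fit, so it ignores the error bars on individual points, and it uses a simple shuffle test as one possible way of asking whether the correlation could arise by chance.

```python
import numpy as np

def pearson_r(x, y):
    """Sample correlation coefficient: covariance divided by the product of the spreads."""
    dx, dy = x - x.mean(), y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

def permutation_p_value(x, y, n_shuffles=2000, seed=0):
    """Fraction of random re-pairings of x and y whose |r| is at least the observed |r|."""
    rng = np.random.default_rng(seed)
    r_obs = abs(pearson_r(x, y))
    count = 0
    for _ in range(n_shuffles):
        # Shuffling y destroys any real association while keeping both marginals.
        if abs(pearson_r(x, rng.permutation(y))) >= r_obs:
            count += 1
    # The +1 terms count the observed pairing itself, so p is never exactly zero.
    return (count + 1) / (n_shuffles + 1)

def fit_line(x, y):
    """Unweighted least-squares straight line y = slope * x + intercept."""
    dx = x - x.mean()
    slope = np.sum(dx * (y - y.mean())) / np.sum(dx**2)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Synthetic example (assumed values): a true line plus Gaussian scatter.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, size=30)
y = 2.0 * x + 1.0 + rng.normal(0.0, 3.0, size=30)

r = pearson_r(x, y)
p = permutation_p_value(x, y)
slope, intercept = fit_line(x, y)
print(f"r = {r:.3f}, shuffle-test p = {p:.4f}")
print(f"best-fit line: y = {slope:.3f} x + {intercept:.3f}")
```

A small p-value here says only that scatter this well aligned would rarely appear if the two quantities were unrelated; it says nothing about causation, and the fitted slope is "best" only in the least-squares sense assumed by `fit_line`.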
Notes
1. The simulations were all generated from bivariate Gaussian PDFs, but in the bottom row, we have made the ring shape by subtracting one Gaussian from another.
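As a rough illustration of the construction this note describes, the sketch below draws points from a density proportional to a wide Gaussian minus a narrow one. The specifics are assumptions, not taken from the chapter: both Gaussians are isotropic and share a centre, the widths `sigma_wide` and `sigma_narrow` are arbitrary, and rejection sampling is just one convenient way to draw from such a density. The function name `sample_ring` is hypothetical.

```python
import numpy as np

def sample_ring(n, sigma_wide=2.0, sigma_narrow=1.0, seed=0):
    """Draw n points from a density proportional to
    exp(-r^2 / (2 sigma_wide^2)) - exp(-r^2 / (2 sigma_narrow^2)).

    Rejection sampling: propose from the wide Gaussian and accept with
    probability 1 - exp(-k r^2), which is the ratio of target to proposal.
    Requires sigma_wide > sigma_narrow so that the density is non-negative.
    """
    rng = np.random.default_rng(seed)
    k = 0.5 * (1.0 / sigma_narrow**2 - 1.0 / sigma_wide**2)
    kept, total = [], 0
    while total < n:
        xy = rng.normal(0.0, sigma_wide, size=(2 * n, 2))
        r2 = np.sum(xy**2, axis=1)
        accept = rng.uniform(size=2 * n) < 1.0 - np.exp(-k * r2)
        kept.append(xy[accept])
        total += int(accept.sum())
    return np.concatenate(kept)[:n]

ring = sample_ring(5000)
r = np.corrcoef(ring[:, 0], ring[:, 1])[0, 1]
radius = np.hypot(ring[:, 0], ring[:, 1])
print(f"correlation coefficient of ring sample: {r:.3f}")
print(f"fraction of points within radius 0.3 of the centre: {np.mean(radius < 0.3):.4f}")
```

Under these assumptions the sample has an obvious hole in the middle, yet its correlation coefficient is close to zero: the two variables are clearly related, but not in a way a linear correlation measure picks up.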
References
Bevington, P., Robinson, D.K.: Data Reduction and Error Analysis for the Physical Sciences, 3rd edn. McGraw-Hill Education, New York (2002)
Corder, G.W., Foreman, D.I.: Nonparametric Statistics: A Step-by-Step Approach, 2nd edn. Wiley, Hoboken (2014)
Magorrian, J., et al.: The demography of massive dark objects in galaxy centers. Astron. J. 115, 2285–2305 (1998)
Siegel, S., Castellan, N.J.: Nonparametric Statistics for the Behavioral Sciences, Revised 2nd edn. McGraw-Hill Publishing Company, New York (1988)
Stanton, J.M.: Galton, Pearson, and the Peas: a brief history of linear regression for statistics instructors. J. Stat. Educ. 9(3) (2001)
Wasserman, L.: All of Nonparametric Statistics, Revised Corrected 3 printing edn. McGraw-Hill Publishing Company, New York (2007)
Websites (all accessed March 2019):
Blog post by Rasmus Bååth on Bayesian correlation testing. http://www.sumsar.net/blog/2013/08/bayesian-estimation-of-correlation/
Amusing list of spurious correlations by Tyler Vigen. http://tylervigen.com/spurious-correlations
Wikipedia page on multivariate Gaussians. https://en.wikipedia.org/wiki/Multivariate_normal_distribution
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lawrence, A. (2019). Inference with Two Variables: Correlation Testing and Line Fitting. In: Probability in Physics. Undergraduate Lecture Notes in Physics. Springer, Cham. https://doi.org/10.1007/978-3-030-04544-9_9
DOI: https://doi.org/10.1007/978-3-030-04544-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04542-5
Online ISBN: 978-3-030-04544-9
eBook Packages: Physics and Astronomy, Physics and Astronomy (R0)