
Linear Regression

Advanced Statistics for the Behavioral Sciences

Abstract

In Chap. 2 we learned how to use the QR decomposition to solve an overdetermined system of linear equations. Because our emphasis was computational, we focused on learning mathematical operations rather than interpreting the values they produced. Statistical analyses involve more than solving mathematical problems, however, and in this chapter you will learn how to interpret the findings from a linear regression model.
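To preview that shift, here is a minimal R sketch using made-up data (the variable names and numbers are mine, not the book's): the QR route from Chap. 2 and R's lm() produce identical coefficients, and lm()'s summary output is what this chapter teaches us to interpret.

    # Minimal sketch with made-up data: the QR solution to an overdetermined
    # system (Chap. 2) matches the coefficients that lm() reports.
    set.seed(1)
    n <- 25
    x <- rnorm(n)                  # hypothetical predictor
    y <- 2 + 0.5 * x + rnorm(n)    # hypothetical criterion
    X <- cbind(1, x)               # design matrix with an intercept column

    b_qr <- qr.coef(qr(X), y)      # least-squares solution via QR
    fit  <- lm(y ~ x)              # the same model, fit with lm()

    all.equal(unname(b_qr), unname(coef(fit)))  # TRUE: identical estimates
    summary(fit)                   # the output this chapter teaches us to read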


Notes

  1.

    Portions of this chapter are condensed from Brown (2014). Readers interested in a more thorough presentation should buy several thousand copies of that book.

  2.

    Small samples can create interpretive problems, and these problems will be noted when they arise. At the same time, the data are fake, so interpretive problems are not really much of a concern.

  3.

    Research supports the association being described, but don’t forget I made these data up. Consequently, you should not infer anything about the strength of the association between education and life satisfaction from these graphs.

  4.

    All of the assumptions of a linear model are met in the examples used in this chapter. Beginning in Chap. 6, we will analyze examples that have built-in violations and introduce a broader range of diagnostic measures.

  5.

    Geometrically, the square root of the coefficient of determination equals the cosine of the angle formed by two difference-score vectors, \( \left(y-\overline{y}\right) \) and \( \left(\widehat{y}-\overline{y}\right) \). The angle between these two vectors in our data set is 54.3211°; its cosine is .5832, and squaring the cosine reproduces our coefficient of determination, ≈ .34.
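    A quick numerical check of this relation in R, using made-up data (so the angle differs from the one quoted above):

        # Squaring the cosine of the angle between the centered observed and
        # centered fitted scores reproduces the coefficient of determination.
        set.seed(2)
        x <- rnorm(30)
        y <- 1 + 0.4 * x + rnorm(30)
        fit <- lm(y ~ x)

        u <- y - mean(y)               # centered observed scores
        v <- fitted(fit) - mean(y)     # centered fitted scores (same mean as y)
        cos_angle <- sum(u * v) / sqrt(sum(u^2) * sum(v^2))

        c(angle_deg = acos(cos_angle) * 180 / pi,
          cos_sq    = cos_angle^2,
          r_squared = summary(fit)$r.squared)   # cos_sq equals r_squared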

  6.

    The mean square residual is also commonly denoted MSe, where e stands for error, and I will use the two designations interchangeably unless a distinction is needed.

  7.

    The R code that accompanies this chapter includes a function for generating the ellipse. The operations it performs will be described more fully in Chaps. 4 and 5.

  8.

    In some textbooks, the standard error of a single score is given as:

    $$ {se}_{\widehat{y}}=\sqrt{\left[1+{\mathbf{p}}^{\prime }{\left({\mathbf{X}}^{\prime}\mathbf{X}\right)}^{-1}\mathbf{p}\right]\,{MS}_e} $$
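    A sketch of the computation in R with made-up data; here p holds the new case's predictor scores with a leading 1 for the intercept (the names are mine, not the book's):

        # Standard error of a predicted single score for a new case.
        set.seed(3)
        x <- rnorm(20)
        y <- 3 + 0.7 * x + rnorm(20)
        fit <- lm(y ~ x)

        X   <- model.matrix(fit)
        MSe <- sum(resid(fit)^2) / df.residual(fit)  # mean square residual
        p   <- c(1, 1.5)                             # hypothetical new case: x = 1.5

        se_yhat <- drop(sqrt((1 + t(p) %*% solve(crossprod(X)) %*% p) * MSe))

        # Cross-check: predict()'s prediction interval rests on the same quantity,
        # since se.fit^2 + MSe = [1 + p'(X'X)^(-1) p] * MSe.
        pr <- predict(fit, newdata = data.frame(x = 1.5), se.fit = TRUE)
        all.equal(se_yhat, sqrt(pr$se.fit^2 + MSe))  # TRUE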
  9.

    Neither Q nor R is unique. Consequently, the decomposition produces different values if we enter our predictors in a different order. However, QQ′ = H is unique, so no matter what order we use, we always end up with the same hat matrix and, therefore, the same fitted values and residuals.
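    A small demonstration in R with made-up data:

        # Q depends on the order in which predictors enter X; QQ' does not.
        set.seed(4)
        X1 <- cbind(1, rnorm(15), rnorm(15))   # intercept plus two predictors
        X2 <- X1[, c(1, 3, 2)]                 # same predictors, reversed order

        Q1 <- qr.Q(qr(X1))
        Q2 <- qr.Q(qr(X2))

        isTRUE(all.equal(Q1, Q2))                       # FALSE: Q differs
        isTRUE(all.equal(Q1 %*% t(Q1), Q2 %*% t(Q2)))   # TRUE: same hat matrix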

  10.

    There are several ways to calculate correlations, but one way is to standardize all variables and collect them into a single matrix Z. A correlation matrix is then found as (Z′Z)/(n − 1), and the significance of each value can be tested, with n − 2 degrees of freedom, using \( t=\sqrt{\frac{r^2\left(n-2\right)}{1-{r}^2}} \).
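    In R, with made-up data (the variable names are mine):

        # Correlation matrix from standardized scores: (Z'Z)/(n - 1) equals cor().
        set.seed(5)
        dat <- data.frame(a = rnorm(40), b = rnorm(40))
        dat$c <- 0.5 * dat$a + rnorm(40)       # give one pair a real association
        n <- nrow(dat)

        Z <- scale(as.matrix(dat))             # standardize every variable
        R <- crossprod(Z) / (n - 1)
        all.equal(R, cor(dat), check.attributes = FALSE)  # TRUE

        r <- R["a", "c"]
        t_stat <- sqrt(r^2 * (n - 2) / (1 - r^2))   # |t| on n - 2 df
        2 * pt(-t_stat, df = n - 2)                 # two-tailed p value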

  11.

    Had our sample been larger, the contrast would be significant, so this is an instance where sample size matters. Nevertheless, the more general point being made here is valid: A predictor that makes a significant contribution to R2 is not necessarily a significantly stronger predictor than one that does not make a significant contribution to R2.

  12.

    Our researcher only surveyed drivers, so zero is not meaningful in our sample.

  13.

    Figure 3.8 has been smoothed for illustrative purposes. The topic of smoothing polynomial functions will be discussed in Chap. 9.

  14.

    The Johnson-Neyman method provides useful information, but it can be hard to interpret all of the findings it produces. In our case, it is hard to imagine why monthly mileage would predict concern over gasoline prices for people who earn ~$200,000 per year, but not for people who earn much less. Of course, the data are fabricated, so it's all nonsense anyway, but the fact remains that, in many cases, the technique identifies significant values that defy interpretation.

  15.

    By switching the values each group receives, different comparisons can be made within each coding scheme.

  16.

    With fractional coding, the regression coefficients represent mean differences.

  17.

    There are several ways to compute orthogonal contrast codes, and the scheme shown in Table 3.13 uses a form known as Helmert codes.
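    For reference, base R's contr.helmert() generates one standard version of these codes; Table 3.13 may use a rescaled or sign-flipped variant, but the columns are orthogonal either way:

        # Helmert contrast codes for four groups, as built into base R.
        C <- contr.helmert(4)
        C              # column j compares group j + 1 with the mean of groups 1..j
        crossprod(C)   # diagonal matrix: the contrast columns are mutually orthogonal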

References

  • Brown, J. D. (2014). Linear models in matrix form: A hands-on approach for the behavioral sciences. New York: Springer.


  • Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis in the behavioral sciences. Mahwah, NJ: Erlbaum.


  • Draper, N. R., & Smith, H. (1998). Applied regression analysis (3rd ed.). New York: Wiley.


  • Fox, J. (1987). Effect displays for generalized linear models. Sociological Methodology, 17, 347–361.




Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature


Cite this chapter

Brown, J.D. (2018). Linear Regression. In: Advanced Statistics for the Behavioral Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-93549-2_3
