Linear Models and Regression Diagnostics

Political Analysis Using R

Part of the book series: Use R! (USE R)

Abstract

The linear regression model estimated with ordinary least squares (OLS) is a workhorse model in Political Science.

Notes

  1. Berkman and Plutzer’s data file, named BPchap7.dta, is available from the Dataverse linked on page vii or the chapter content linked on page 79. Remember that you may need to use the setwd command to point to where you have saved the data.

  2. A theoretically attractive alternative to listwise deletion as a means of handling missing data is multiple imputation. See Little and Rubin (1987), Rubin (1987), and King et al. (2001) for more details.

  3. See Brambor et al. (2006) for further details on interaction terms. Also, note that an equivalent specification of this model could be achieved by replacing phase1*senior_c and phase1*notest_p with the terms phase1+senior_c+ph_senior+notest_p+ph_notest_p. We are simply introducing each of the terms separately in this way.

  4. Users are reminded that for one-tailed tests, in which the user wishes to test that the partial coefficient specifically is either greater than or less than zero, the p-value will differ. If the sign of the coefficient matches the alternative hypothesis, then the corresponding p-value is half of what is reported. (Naturally, if the sign of the coefficient is opposite the sign of the alternative hypothesis, the data do not fit with the researcher’s hypothesis.) Additionally, researchers may want to test a hypothesis in which the null hypothesis is something other than zero: In this case, the user can construct the correct t-ratio using the reported estimate and standard error.

  5. Researchers who write their documents with LaTeX can easily transfer the results of a linear model from R to a table using the xtable package. (HTML is also supported by xtable.) On first use, install with: install.packages("xtable"). Once installed, simply entering library(xtable); xtable(mod.hours) would produce LaTeX-ready code for a table that is similar to Table 6.1. As another option for outputting results, see the rtf package about how to output results into Rich Text Format.

  6. In fact, we also could conclude that the coefficient is greater than zero at the 95% confidence level. For more on how confidence intervals can be useful for one-tailed tests as well, see Gujarati and Porter (2009, p. 115).

  7. In other words, if we fail to reject the null hypothesis for a Jarque–Bera test, then we conclude that there is not significant evidence of non-normality. Note that this is different from concluding that we do have normality. However, this is the strongest conclusion we can draw with this test statistic.

  8. A VIF of 10 means that 90% of the variance in a predictor can be explained by the other predictors, which in most contexts can be regarded as a large degree of common variance. Unlike other diagnostic tests, though, this rule of thumb should not be regarded as a test statistic. Ultimately the researcher must draw a substantive conclusion from the results.
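
The arithmetic behind note 8 follows from the standard definition of the variance inflation factor, which the note itself does not spell out. As a worked sketch, let R_j^2 (notation assumed here, not taken from the chapter) be the R-squared from regressing predictor j on all of the other predictors:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\quad\Longrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j},
\qquad
\mathrm{VIF}_j = 10 \;\Rightarrow\; R_j^2 = 1 - \tfrac{1}{10} = 0.90 .
```

By the same formula, a VIF of 5 would correspond to 80% shared variance, which is why the cutoff is a matter of judgment rather than a formal test.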
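
The equivalence described in note 3 can be checked directly in R. The sketch below uses simulated data, not Berkman and Plutzer's file: the predictor names come from the note, while the outcome `y`, the sample size, and the coefficient values are made up for illustration.

```r
# Simulated stand-in data (hypothetical values, NOT the BPchap7.dta file)
set.seed(1)
n <- 200
d <- data.frame(
  phase1   = rnorm(n),
  senior_c = rnorm(n),
  notest_p = rbinom(n, 1, 0.5)
)
d$y <- 1 + 0.5 * d$phase1 - 0.3 * d$senior_c + 0.2 * d$phase1 * d$senior_c +
  0.4 * d$notest_p + rnorm(n)

# Product terms built by hand, named as in the note
d$ph_senior   <- d$phase1 * d$senior_c
d$ph_notest_p <- d$phase1 * d$notest_p

# Specification 1: the * operator expands to main effects plus interactions
m_star <- lm(y ~ phase1 * senior_c + phase1 * notest_p, data = d)

# Specification 2: every term entered separately
m_sep <- lm(y ~ phase1 + senior_c + ph_senior + notest_p + ph_notest_p,
            data = d)

# The two fits should agree up to numerical tolerance
same_fit <- all.equal(unname(fitted(m_star)), unname(fitted(m_sep)))
same_int <- all.equal(unname(coef(m_star)["phase1:senior_c"]),
                      unname(coef(m_sep)["ph_senior"]))
print(c(same_fit = isTRUE(same_fit), same_int = isTRUE(same_int)))
```

The only visible difference between the two fits is labeling and term order: R names the automatically generated product `phase1:senior_c`, whereas the hand-built column keeps the name `ph_senior`.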
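
The two calculations mentioned in note 4 can be sketched from the pieces that summary() already reports. The data below are simulated, and the names `mod`, `b0`, and the null value of 1 are assumptions chosen only for illustration.

```r
# Simulated example (hypothetical data and model)
set.seed(42)
n <- 100
x <- rnorm(n)
y <- 2 + 0.5 * x + rnorm(n)
mod <- lm(y ~ x)

# Pull the reported estimate, standard error, and two-tailed p-value
tab   <- coef(summary(mod))
est   <- tab["x", "Estimate"]
se    <- tab["x", "Std. Error"]
p_two <- tab["x", "Pr(>|t|)"]
df    <- df.residual(mod)

# One-tailed test of H_A: beta > 0 (upper-tail area of the t distribution).
# When est > 0 this equals half of the reported two-tailed p-value.
p_one <- pt(est / se, df, lower.tail = FALSE)

# Two-tailed test of a non-zero null, H_0: beta = b0
b0   <- 1
t_b0 <- (est - b0) / se
p_b0 <- 2 * pt(abs(t_b0), df, lower.tail = FALSE)

print(c(p_two = p_two, p_one = p_one, t_b0 = t_b0, p_b0 = p_b0))
```

If the estimate had come out negative, `p_one` would instead equal one minus half of the reported value, which is the case the note describes as the data not fitting the researcher's hypothesis.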

References

  • Berkman M, Plutzer E (2010) Evolution, creationism, and the battle to control America’s classrooms. Cambridge University Press, New York

  • Brambor T, Clark WR, Golder M (2006) Understanding interaction models: improving empirical analyses. Polit Anal 14(1):63–82

  • Gujarati DN, Porter DC (2009) Basic econometrics, 5th edn. McGraw-Hill/Irwin, New York

  • Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: LeCam LM, Neyman J (eds) Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, volume 1: statistics. University of California Press, Berkeley, CA

  • King G, Honaker J, Joseph A, Scheve K (2001) Analyzing incomplete political science data: an alternative algorithm for multiple imputation. Am Polit Sci Rev 95(1):49–69

  • Little RJA, Rubin DB (1987) Statistical analysis with missing data, 2nd edn. Wiley, New York

  • Owsiak AP (2013) Democratization and international border agreements. J Polit 75(3):717–729

  • Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York

  • White H (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48(4):817–838

© 2015 Springer International Publishing Switzerland

Cite this chapter

Monogan, J.E. (2015). Linear Models and Regression Diagnostics. In: Political Analysis Using R. Use R!. Springer, Cham. https://doi.org/10.1007/978-3-319-23446-5_6
