Linear Models and Regression Diagnostics

Monogan, James E.

doi:10.1007/978-3-319-23446-5_6

Part of the book series: Use R! (USE R)

Abstract

The linear regression model estimated with ordinary least squares (OLS) is a workhorse model in Political Science.

Notes

1. Berkman and Plutzer’s data file, named BPchap7.dta, is available from the Dataverse linked on page vii or the chapter content linked on page 79. Remember that you may need to use the setwd command to point to where you have saved the data.
2. A theoretically attractive alternative to listwise deletion as a means of handling missing data is multiple imputation. See Little and Rubin (1987), Rubin (1987), and King et al. (2001) for more details.
3. See Brambor et al. (2006) for further details on interaction terms. Also, note that an equivalent specification of this model could be achieved by replacing phase1*senior_c and phase1*notest_p with the terms phase1+senior_c+ph_senior+notest_p+ph_notest_p. We are simply introducing each of the terms separately in this way.
4. Users are reminded that for one-tailed tests, in which the user wishes to test that the partial coefficient specifically is either greater than or less than zero, the p-value will differ from the two-tailed value that is reported. If the sign of the coefficient matches the alternative hypothesis, then the corresponding p-value is half of what is reported. (Naturally, if the sign of the coefficient is opposite the sign of the alternative hypothesis, the data do not fit with the researcher’s hypothesis.) Additionally, researchers may want to test a hypothesis in which the null value is something other than zero. In this case, the user can construct the correct t-ratio using the reported estimate and standard error.
5. Researchers who write their documents with LaTeX can easily transfer the results of a linear model from R to a table using the xtable package. (HTML is also supported by xtable.) On first use, install with: install.packages("xtable"). Once installed, simply entering library(xtable); xtable(mod.hours) would produce LaTeX-ready code for a table that is similar to Table 6.1. As another option for outputting results, see the rtf package for how to output results in Rich Text Format.
6. In fact, we also could conclude that the coefficient is greater than zero at the 95% confidence level. For more on how confidence intervals can be useful for one-tailed tests as well, see Gujarati and Porter (2009, p. 115).
7. In other words, if we fail to reject the null hypothesis for a Jarque–Bera test, then we conclude that there is not significant evidence of non-normality. Note that this is different from concluding that we do have normality. However, this is the strongest conclusion we can draw with this test statistic.
8. A VIF of 10 means that 90% of the variance in a predictor can be explained by the other predictors, which in most contexts can be regarded as a large degree of common variance. Unlike other diagnostic tests, though, this rule of thumb should not be regarded as a test statistic. Ultimately the researcher must draw a substantive conclusion from the results.
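
The arithmetic behind Notes 4 and 8 can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated rather than taken from BPchap7.dta, the variable names (x1, x2, y), the model object mod, and the null value of 0.25 are all hypothetical, and the VIF is computed by hand from an auxiliary regression using the standard identity VIF = 1/(1 - R²) rather than with any particular package.

```r
# Simulated data (hypothetical; not the Berkman and Plutzer data)
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 0.4 * x1 - 0.2 * x2 + rnorm(n)
mod <- lm(y ~ x1 + x2)

# Estimate, standard error, and residual degrees of freedom for x1
coefs <- summary(mod)$coefficients
est   <- coefs["x1", "Estimate"]
se    <- coefs["x1", "Std. Error"]
df    <- df.residual(mod)

# Two-tailed p-value, as reported by summary()
p.two <- coefs["x1", "Pr(>|t|)"]

# Note 4: one-tailed test with alternative hypothesis beta > 0.
# When the estimate is positive, this equals half the reported p-value.
t.zero <- est / se
p.one  <- pt(t.zero, df = df, lower.tail = FALSE)

# Note 4: null hypothesis other than zero (0.25 is an arbitrary example).
# The t-ratio is rebuilt from the reported estimate and standard error.
beta.null <- 0.25
t.alt <- (est - beta.null) / se
p.alt <- 2 * pt(abs(t.alt), df = df, lower.tail = FALSE)

# Note 8: VIF for x1 from the R-squared of regressing x1 on the
# other predictors, and the reverse mapping from a VIF to that R-squared
r2.aux <- summary(lm(x1 ~ x2))$r.squared
vif.x1 <- 1 / (1 - r2.aux)
r2.from.vif <- function(vif) 1 - 1 / vif
r2.from.vif(10)  # 0.9, i.e. 90% of the predictor's variance is shared
```

With a fitted model from real data, the same steps would apply after swapping in the model object and the coefficient name of interest.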

References

Berkman M, Plutzer E (2010) Evolution, creationism, and the battle to control America’s classrooms. Cambridge University Press, New York
Brambor T, Clark WR, Golder M (2006) Understanding interaction models: improving empirical analyses. Polit Anal 14(1):63–82
Gujarati DN, Porter DC (2009) Basic econometrics, 5th edn. McGraw-Hill/Irwin, New York
Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: LeCam LM, Neyman J (eds) Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, volume 1: statistics. University of California Press, Berkeley, CA
King G, Honaker J, Joseph A, Scheve K (2001) Analyzing incomplete political science data: an alternative algorithm for multiple imputation. Am Polit Sci Rev 95(1):49–69
Little RJA, Rubin DB (1987) Statistical analysis with missing data, 2nd edn. Wiley, New York
Owsiak AP (2013) Democratization and international border agreements. J Polit 75(3):717–729
Rubin DB (1987) Multiple imputation for nonresponse in surveys. Wiley, New York
White H (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48(4):817–838

Author information

Authors and Affiliations

Department of Political Science, University of Georgia, Athens, GA, USA
James E. Monogan III

6.1 Electronic Supplementary material

Dataverse (2,154 KB)

About this chapter

Cite this chapter

Monogan, J.E. (2015). Linear Models and Regression Diagnostics. In: Political Analysis Using R. Use R!. Springer, Cham. https://doi.org/10.1007/978-3-319-23446-5_6

DOI: https://doi.org/10.1007/978-3-319-23446-5_6
Published: 15 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23445-8
Online ISBN: 978-3-319-23446-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
