Chapter 1: Statistical Models

Generalized Linear Models With Examples in R

Part of the book series: Springer Texts in Statistics (STS)

Abstract

This chapter introduces the concept of a statistical model. One particular type of statistical model, the generalized linear model, is the focus of this book, and so we begin with an introduction to statistical models in general. This allows us to introduce the necessary language, notation, and other important issues. We first discuss conventions for describing data mathematically (Sect. 1.2). We then highlight the importance of plotting data (Sect. 1.3), and explain how to numerically code non-numerical variables (Sect. 1.4) so that they can be used in mathematical models. We then introduce the two components of a statistical model used for understanding data (Sect. 1.5): the systematic and random components. The class of regression models is then introduced (Sect. 1.6), which includes all models in this book. Model interpretation is then considered (Sect. 1.7), followed by a comparison of physical models and statistical models (Sect. 1.8) to highlight the similarities and differences. The purpose of a statistical model is then given (Sect. 1.9), followed by a description of the two criteria for evaluating statistical models: accuracy and parsimony (Sect. 1.10). The importance of understanding the limitations of statistical models is then addressed (Sect. 1.11), including the differences between observational and experimental data. The generalizability of models is then discussed (Sect. 1.12). Finally, we make some introductory comments about using R for statistical modelling (Sect. 1.13).

all models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind.

Box and Draper [2, p. 424]

References

  1. Agresti, A.: An Introduction to Categorical Data Analysis, second edn. Wiley-Interscience (2007)

  2. Box, G.E.P., Draper, N.R.: Empirical Model-Building and Response Surfaces. Wiley, New York (1987)

  3. Brockmann, H.J.: Satellite male groups in horseshoe crabs, Limulus polyphemus. Ethology 102, 1–21 (1996)

  4. Dunn, P.K., Smyth, G.K.: GLMsData: Generalized linear model data sets (2017). URL https://CRAN.R-project.org/package=GLMsData. R package version 1.0.0

  5. Efron, B.: Double exponential families and their use in generalized linear regression. Journal of the American Statistical Association 81(395), 709–721 (1986)

  6. Giauque, W.F., Wiebe, R.: The heat capacity of hydrogen bromide from 15°K. to its boiling point and its heat of vaporization. The entropy from spectroscopic data. Journal of the American Chemical Society 51(5), 1441–1449 (1929)

  7. Hand, D.J., Daly, F., Lunn, A.D., McConway, K.Y., Ostrowski, E.: A Handbook of Small Data Sets. Chapman and Hall, London (1996)

  8. Joglekar, G., Scheunemyer, J.H., LaRiccia, V.: Lack-of-fit testing when replicates are not available. The American Statistician 43, 135–143 (1989)

  9. Johnson, B., Courtney, D.M.: Tower building. Child Development 2(2), 161–162 (1931)

  10. Kahn, M.: An exhalent problem for teaching statistics. Journal of Statistics Education 13(2) (2005)

  11. Maron, M.: Threshold effect of eucalypt density on an aggressive avian competitor. Biological Conservation 136, 100–107 (2007)

  12. Mazess, R.B., Peppler, W.W., Gibbons, M.: Total body composition by dual-photon (153Gd) absorptiometry. American Journal of Clinical Nutrition 40, 834–839 (1984)

  13. Myers, R.H., Montgomery, D.C., Vining, G.G.: Generalized Linear Models with Applications in Engineering and the Sciences. Wiley, Chichester (2002)

  14. Nelson, W.: Applied Life Data Analysis. Wiley Series in Probability and Statistics. John Wiley & Sons, New York (1982)

  15. Royston, P., Altman, D.G.: Regression using fractional polynomials of continuous covariates: Parsimonious parametric modelling. Journal of the Royal Statistical Society, Series C 43(3), 429–467 (1994)

  16. Shacham, M., Brauner, N.: Minimizing the effects of collinearity in polynomial regression. Industrial & Engineering Chemistry Research 36, 4405–4412 (1997)

  17. Singer, J.D., Willett, J.B.: Improving the teaching of applied statistics: Putting the data back into data analysis. The American Statistician 44(3), 223–230 (1990)

  18. Smyth, G.K.: Australasian data and story library (Ozdasl) (2011). URL http://www.statsci.org/data

  19. Tager, I.B., Weiss, S.T., Muñoz, A., Rosner, B., Speizer, F.E.: Longitudinal study of the effects of maternal smoking on pulmonary function in children. New England Journal of Medicine 309(12), 699–703 (1983)

  20. Tager, I.B., Weiss, S.T., Rosner, B., Speizer, F.E.: Effect of parental cigarette smoking on the pulmonary function of children. American Journal of Epidemiology 110(1), 15–26 (1979)

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this chapter

Cite this chapter

Dunn, P.K., Smyth, G.K. (2018). Chapter 1: Statistical Models. In: Generalized Linear Models With Examples in R. Springer Texts in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-0118-7_1
