
Abstract

Regression analysis, often referred to simply as regression, is an important tool in statistical analysis. The concept first appeared in an 1877 study on sweet-pea seeds by Sir Francis Galton (1822–1911). He used the idea of regression again in a later study on the heights of fathers and sons. He discovered that sons of tall fathers are tall but somewhat shorter than their fathers, while sons of short fathers are short but somewhat taller than their fathers. In other words, body height tends toward the mean. Galton called this process a regression, literally a step back or decline. We can perform a correlation to measure the association between the heights of sons and fathers. We can also infer the causal direction of the association: the height of sons depends on the height of fathers and not the other way around. Galton indicated the causal direction by referring to the height of sons as the dependent variable and the height of fathers as the independent variable. But take heed: regression does not necessarily prove the causality of an association. The direction of effect must be derived theoretically before it can be tested empirically with regression. Sometimes the direction of causality cannot be determined, as, for example, between the ages of couples getting married. Does the age of the groom determine the age of the bride or vice versa? Or do the groom's age and the bride's age determine each other mutually? Sometimes the causality is obvious: blood pressure has no influence on age, but age has an influence on blood pressure. Likewise, body height has an influence on weight, but the reverse association is unlikely (Swoboda 1971, p. 308).
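Galton's regression to the mean can be illustrated with simulated data. The sketch below uses invented heights and coefficients (not Galton's actual measurements): because the estimated slope lies between 0 and 1, predicted sons' heights sit closer to the population mean than their fathers' heights do.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated father/son heights in cm (illustrative values, not Galton's data):
# sons inherit half their father's deviation from the population mean of 175 cm.
fathers = rng.normal(175, 7, size=1000)
sons = 175 + 0.5 * (fathers - 175) + rng.normal(0, 5, size=1000)

# Correlation measures the strength of the association ...
r = np.corrcoef(fathers, sons)[0, 1]

# ... while regression treats sons' height as dependent on fathers' height.
slope, intercept = np.polyfit(fathers, sons, 1)

print(f"r = {r:.2f}, slope = {slope:.2f}")
# A slope below 1 means tall fathers have tall sons, but sons closer to the mean.
```

Reversing the roles of the variables would run just as well mechanically, which is exactly the warning above: the software cannot tell us the direction of causality.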


Notes

  1.

    To use the LINEST function for the dataset mail_order_business.xls, select a range in the Excel sheet in which the regression results are to appear. With k regressors (in our case k = 2), this range must have five rows and k + 1 columns. Next choose the LINEST command under Formulas → Insert Function → Statistical. Insert the dependent y variables (B2:B101) into the field Known_y's and the x variables (C2:D101) into the field Known_x's. If the regression contains a constant, the value TRUE (or 1) must be entered into the const field and the stats field. The command is NOT activated by the Enter key, but by pressing CTRL+SHIFT+ENTER simultaneously. The first row displays the coefficients βk to β1; the last column of the first row contains the value of the constant α. The other rows display the remaining parameters, some of which we have yet to discuss. The second row shows the standard errors of the coefficients; the third row, the coefficient of determination (R2) and the standard error of the residuals; the fourth row, the F value and the degrees of freedom. The last row contains the sums of squares of the regression (RSS) and of the residuals (ESS).

  2.

    The Add-Ins Manager can be accessed via File → Options → Add-ins → Manage: Excel Add-ins → Go…

  3.

    In Sect. 10.3 we calculated the regression coefficients β = (β0 = α; β1; …; βk) as follows: \( \beta ={\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }y \). The invertibility of (X′X) assumes that the matrix X has full rank. In the case of perfect multicollinearity, at least two columns of the matrix are linearly dependent, so (X′X) can no longer be inverted.
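If Excel is unavailable, the quantities in the LINEST output described in Note 1 can be reproduced with standard least-squares algebra. The sketch below uses synthetic data in place of mail_order_business.xls (which is not reproduced here); the true coefficients 1.5, 2.0, and −0.5 are invented, and variable names follow the note's convention of RSS for the regression sum of squares and ESS for the residual sum of squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for mail_order_business.xls: n = 100 observations, k = 2 regressors.
n, k = 100, 2
X = rng.normal(size=(n, k))
y = 1.5 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

Xc = np.column_stack([np.ones(n), X])          # column of ones for the constant alpha
beta = np.linalg.lstsq(Xc, y, rcond=None)[0]   # [alpha, beta1, beta2]

resid = y - Xc @ beta
ess = resid @ resid                            # residual sum of squares (Note 1: ESS)
tss = ((y - y.mean()) ** 2).sum()              # total sum of squares
rss = tss - ess                                # regression sum of squares (Note 1: RSS)
df = n - k - 1                                 # degrees of freedom
r2 = rss / tss                                 # coefficient of determination R^2
se_resid = np.sqrt(ess / df)                   # standard error of the residuals
se_beta = se_resid * np.sqrt(np.diag(np.linalg.inv(Xc.T @ Xc)))  # SEs of coefficients
f_value = (rss / k) / (ess / df)               # F value

print(beta, se_beta, r2, se_resid, f_value, df, rss, ess)
```

These are the same eight pieces of information LINEST arranges in its five-row array; only the layout differs.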
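The normal equations from Note 3, and their failure under perfect multicollinearity, can be sketched numerically (a minimal illustration with invented data; the coefficients 1.0, 2.0, and −1.0 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Full-rank design matrix: beta = (X'X)^{-1} X'y is well defined.
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta)  # close to the true values (1.0, 2.0, -1.0)

# Perfect multicollinearity: the third column is an exact multiple of the second,
# so X no longer has full rank and X'X cannot be inverted.
X_bad = np.column_stack([np.ones(n), x1, 2 * x1])
rank = np.linalg.matrix_rank(X_bad.T @ X_bad)
print(rank)  # 2, one less than the 3 columns: (X'X)^{-1} does not exist
```

In practice one would use a numerically stabler solver such as `np.linalg.lstsq` rather than forming the inverse explicitly; the explicit inverse is shown only to mirror the formula in the note.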

References

  • Hair, J. et al. (2006). Multivariate data analysis (6th ed.). Upper Saddle River, NJ: Prentice Hall International.

  • Swoboda, H. (1971). Exakte Geheimnisse: Knaurs Buch der modernen Statistik. Munich, Zurich: Knaur.



Copyright information

© 2019 Springer Nature Switzerland AG


Cite this chapter

Cleff, T. (2019). Regression Analysis. In: Applied Statistics and Multivariate Data Analysis for Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-030-17767-6_10
