
The Regression Models with Dummy Explanatory Variables

Chapter in: Applications of Regression Techniques

Abstract

Dummy variables can be incorporated in regression models just as easily as quantitative variables. In fact, a regression model may contain regressors that are all dummy, or qualitative, in nature. The results of such a model are exactly the same as those of an Analysis of Variance (ANOVA) model: the regression model used to assess the statistical significance of the relationship between a quantitative regressand and all-qualitative (dummy) regressors is equivalent to the corresponding ANOVA model. For each qualitative regressor, the number of dummy variables introduced must be one less than the number of categories of that variable; if a qualitative variable has m categories, introduce only (m − 1) dummy variables. The category for which no dummy variable is assigned is known as the base, benchmark, control, comparison, reference, or omitted category, and all comparisons are made in relation to it. The intercept represents the mean value of the benchmark category. The coefficients attached to the dummy variables are known as differential intercept coefficients because they tell by how much the intercept of the category receiving the value 1 differs from the intercept of the benchmark category.
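As a sketch of this equivalence (the data and variable names below are invented for illustration, not taken from the chapter), the following fits a regression of a quantitative regressand on m − 1 = 2 dummies for a three-category qualitative variable. The intercept comes out as the mean of the benchmark category, and each dummy coefficient as the difference between its category mean and the benchmark mean, exactly as an ANOVA decomposition would report:

```python
import numpy as np

# Illustrative data: a quantitative regressand observed in m = 3 categories
# (values are assumptions made for this sketch).
y = np.array([10.0, 12.0, 11.0, 15.0, 16.0, 14.0, 20.0, 21.0, 19.0])
category = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# Introduce m - 1 = 2 dummies; category 0 is the benchmark (omitted) category.
d1 = (category == 1).astype(float)
d2 = (category == 2).astype(float)
X = np.column_stack([np.ones_like(y), d1, d2])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The intercept equals the mean of the benchmark category, and each dummy
# coefficient is a differential intercept relative to that benchmark.
print(beta[0])           # mean of category 0: 11.0
print(beta[1], beta[2])  # category means minus benchmark mean: 4.0, 9.0
```

With a saturated dummy design the fitted values are just the category means, which is why the two coefficient readings above match the group means computed directly.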


Notes

  1. Note: Three binary variables can be taken if the constant term is dropped from the regression equation.
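A minimal numerical illustration of this note (the data are invented for the purpose): when the constant term is dropped, one dummy per category is admissible, and each coefficient then estimates the mean of its own category directly rather than a differential intercept.

```python
import numpy as np

# Illustrative data: three categories, three observations each (assumed values).
y = np.array([10.0, 12.0, 11.0, 15.0, 16.0, 14.0, 20.0, 21.0, 19.0])
category = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# No constant term, so all three binary (dummy) variables can be included.
D = np.column_stack([(category == g).astype(float) for g in (0, 1, 2)])
coef, *_ = np.linalg.lstsq(D, y, rcond=None)

print(coef)  # each coefficient is the mean of its own category: [11. 15. 20.]
```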

Author information

Correspondence to Manoranjan Pal.

Appendix

Linear Probability Model Versus Linear Discriminant Function

Suppose we have multivariate observations that come from one of two groups, say group 1 and group 2. The linear discriminant function (LDF) is a linear function of the variables by which we can predict whether a new observation comes from group 1 or group 2. The fitted value of the linear probability model is interpreted as the probability that the event will occur; we assume that if the event occurs the observation comes from group 1, and otherwise from group 2.

The linear probability model (LPM) has a direct link with the LDF.

Let us first see how the linear discriminant function is constructed.

Let the linear function be

$${z} = \lambda_{0} + \lambda_{1} {x}_{1} + \lambda_{2} {x}_{2} + \ldots + \lambda_{k} {x}_{k} .$$

To get the best discrimination between the two groups, we would want to choose the λi values so that the ratio

$$\frac{{{\text{Between group variance of}}\,{z}}}{{{\text{Within group variance of}}\,{z}}}$$

is a maximum. Fisher suggested that we define a dummy variable

  • \({y} = {n}_{2} /\left( {{n}_{1} + {n}_{2} } \right)\) if the individual belongs to the first group,

  • and \({y} = - {n}_{1} /\left( {{n}_{1} + {n}_{2} } \right)\) if the individual belongs to the second group;

then \({\hat{\lambda }}_{i} = \hat{\beta }_{i} /\left\{ {{\text{RSS}}/\left( {{n}_{1} + {n}_{2} - 2} \right)} \right\}\), where RSS is the residual sum of squares of the regression of y on the x values and the \(\hat{\beta }_{i}\) are its coefficients.

The LPM sets

  • y = 1 if the individual belongs to the first group,

  • and y = 0 if the individual belongs to the second group.

This amounts to adding \({n}_{1} /\left( {{n}_{1} + {n}_{2} } \right)\) to each observation of y as defined by Fisher. Since adding a constant to the regressand shifts only the intercept of a least-squares fit, only the estimate of the constant term changes; the slope estimates are identical under the two codings.
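This relationship is easy to verify numerically. The sketch below (simulated data; all names are assumptions for the illustration) regresses both codings of y on the same regressors and confirms that the slope estimates coincide while the intercepts differ by exactly n1/(n1 + n2):

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 5, 7

# Simulated two-group data: group 1 shifted relative to group 2.
x = np.concatenate([rng.normal(0.0, 1.0, (n1, 2)) + 1.0,
                    rng.normal(0.0, 1.0, (n2, 2))])
X = np.column_stack([np.ones(n1 + n2), x])

# Fisher's coding of the dummy regressand.
y_fisher = np.concatenate([np.full(n1, n2 / (n1 + n2)),
                           np.full(n2, -n1 / (n1 + n2))])
# LPM coding: 1 for group 1, 0 for group 2.
y_lpm = np.concatenate([np.ones(n1), np.zeros(n2)])

b_fisher, *_ = np.linalg.lstsq(X, y_fisher, rcond=None)
b_lpm, *_ = np.linalg.lstsq(X, y_lpm, rcond=None)

# y_lpm = y_fisher + n1/(n1 + n2), so only the intercept differs.
print(np.allclose(b_fisher[1:], b_lpm[1:]))                 # True
print(np.isclose(b_lpm[0] - b_fisher[0], n1 / (n1 + n2)))   # True
```

The slope equality holds for any data, because least squares is linear in the regressand and the design matrix contains a constant column that absorbs the shift.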


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Pal, M., Bharati, P. (2019). The Regression Models with Dummy Explanatory Variables. In: Applications of Regression Techniques. Springer, Singapore. https://doi.org/10.1007/978-981-13-9314-3_8
