Abstract
Dummy variables can be incorporated into regression models just as easily as quantitative variables. In fact, a regression model may contain regressors that are all exclusively dummy, or qualitative, in nature; the results of such a model are exactly the same as those of an Analysis of Variance (ANOVA) model. The regression model used to assess the statistical significance of the relationship between a quantitative regressand and all qualitative (dummy) regressors is thus equivalent to a corresponding ANOVA model. For each qualitative regressor, the number of dummy variables introduced must be one less than the number of categories of that variable: if a qualitative variable has m categories, introduce only (m − 1) dummy variables. The category to which no dummy variable is assigned is known as the base, benchmark, control, comparison, reference, or omitted category, and all comparisons are made in relation to it. The intercept represents the mean value of the benchmark category. The coefficients attached to the dummy variables are known as differential intercept coefficients because they tell by how much the intercept of the category that receives the value 1 differs from the intercept of the benchmark category.
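As an illustration (not taken from the chapter), the following sketch fits an ordinary least-squares regression of a quantitative outcome on dummy variables for three hypothetical categories A, B, and C, with A as the benchmark. It verifies numerically that the intercept equals the benchmark category's mean and that each dummy coefficient equals the difference between that category's mean and the benchmark mean. All data and names are invented for the example.

```python
import numpy as np

# Hypothetical data: a quantitative regressand observed in three categories.
# Category A is the benchmark; B and C receive dummy variables (m = 3, so m - 1 = 2 dummies).
rng = np.random.default_rng(0)
y_A = rng.normal(10.0, 1.0, size=50)
y_B = rng.normal(12.0, 1.0, size=50)
y_C = rng.normal(15.0, 1.0, size=50)
y = np.concatenate([y_A, y_B, y_C])

# Design matrix: constant term plus the two dummies D_B and D_C.
D_B = np.r_[np.zeros(50), np.ones(50), np.zeros(50)]
D_C = np.r_[np.zeros(100), np.ones(50)]
X = np.column_stack([np.ones(len(y)), D_B, D_C])

# Ordinary least squares via the normal equations (lstsq).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b_B, b_C = beta

# The intercept is the sample mean of the benchmark category A,
# and the differential intercept coefficients are differences of means.
assert np.isclose(intercept, y_A.mean())
assert np.isclose(b_B, y_B.mean() - y_A.mean())
assert np.isclose(b_C, y_C.mean() - y_A.mean())
```

This is exactly the ANOVA equivalence described above: the fitted values of the all-dummy regression are the category means.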
Notes
1. Three binary variables can be taken if the constant term is dropped from the regression equation.
Appendix
Linear Probability Model Versus Linear Discriminant Function
Suppose we have multivariate observations that come from one of two groups, say group 1 and group 2. The linear discriminant function (LDF) is a linear function of the variables by which we can predict whether a new observation came from group 1 or group 2. The linear probability model (LPM) is interpreted as the probability that an event will occur; we assume that if the event occurs the observation comes from group 1, and otherwise from group 2.
The LPM has a direct link with the LDF.
Let us first see how the linear discriminant function is constructed.
Let the linear function be
\(Z = \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{k} x_{k}.\)
To get the best discrimination between the two groups, we choose the λi values so that the ratio of the between-group variation of Z to its within-group variation,
\(\left( {\bar{Z}_{1} - \bar{Z}_{2} } \right)^{2} /s_{Z}^{2},\)
is maximized. Fisher suggested that we define a dummy variable
- y = n2/(n1 + n2) if the individual belongs to the first group,
- y = (−n1)/(n1 + n2) if the individual belongs to the second group,
then \({\hat{\lambda }}_{i} = \hat{\beta }_{i} /\left\{ {{\text{RSS}}/\left( {{n}_{1} + {n}_{2} - 2} \right)} \right\}\), where RSS is the residual sum of squares from the regression of y on the x values and the \(\hat{\beta }_{i}\) are the estimated coefficients.
The LPM is
- y = 1 if the individual belongs to the first group,
- y = 0 if the individual belongs to the second group.
This amounts to adding n1/(n1 + n2) to each observation of y as defined by Fisher, so only the estimate of the constant term changes.
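The LPM–LDF link above can be checked numerically. In this sketch (all data invented), the same regressors are regressed once on Fisher's coding of y and once on the 0/1 LPM coding; the slope coefficients coincide and the intercepts differ by exactly n1/(n1 + n2).

```python
import numpy as np

# A hypothetical two-group dataset with two regressors.
rng = np.random.default_rng(1)
n1, n2 = 40, 60
x_g1 = rng.normal(0.0, 1.0, size=(n1, 2)) + np.array([1.0, 0.5])  # group 1
x_g2 = rng.normal(0.0, 1.0, size=(n2, 2))                         # group 2
X = np.column_stack([np.ones(n1 + n2), np.vstack([x_g1, x_g2])])

# Fisher's coding: n2/(n1+n2) for group 1, -n1/(n1+n2) for group 2.
y_fisher = np.r_[np.full(n1, n2 / (n1 + n2)), np.full(n2, -n1 / (n1 + n2))]
# LPM coding: 1 for group 1, 0 for group 2.
y_lpm = np.r_[np.ones(n1), np.zeros(n2)]

# OLS fits for both codings.
b_fisher, *_ = np.linalg.lstsq(X, y_fisher, rcond=None)
b_lpm, *_ = np.linalg.lstsq(X, y_lpm, rcond=None)

# Slopes are identical; only the constant term shifts by n1/(n1 + n2).
assert np.allclose(b_fisher[1:], b_lpm[1:])
assert np.isclose(b_lpm[0] - b_fisher[0], n1 / (n1 + n2))
```

The equality of the slopes is exact, since the LPM coding of y is Fisher's coding plus the constant n1/(n1 + n2) at every observation, and adding a constant to the regressand changes only the intercept.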
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Pal, M., Bharati, P. (2019). The Regression Models with Dummy Explanatory Variables. In: Applications of Regression Techniques. Springer, Singapore. https://doi.org/10.1007/978-981-13-9314-3_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9313-6
Online ISBN: 978-981-13-9314-3
eBook Packages: Mathematics and Statistics (R0)