Econometric Modeling

Part of the Springer Texts in Business and Economics book series (STBE)

Abstract

This chapter treats an econometric model as an experiment, or set of experiments, that are embedded within an economic model. It shows how to unpack the “experimental content” of an econometric analysis, and demonstrates how this concept and the modeling principles from the preceding chapters contribute to the development of econometric models. These ideas come to life in applications to orchestra auditions, aluminum recycling, the returns to schooling, economic development in The Gambia, fracking, and more.

Notes

  1. Except in New Zealand, where everyone starts school on their fifth birthday, and thus has the same number of years of required schooling.

  2. We aren’t given the sampling variation in the birth-quarter means of schooling or earnings, which would help us figure this out more definitively. But a careful look at Fig. 8.1a indicates it is small, a consequence of the large number of individual observations. While the non-seasonal wiggle is sizeable, little of it appears to be random. Thus most of this wiggle comes from unobserved birth-quarter-level influences, not sampling error.

  3. When state-level policies are analyzed using micro data, a two-step procedure can accomplish this goal. The first step regresses individual-level outcomes on a set of dummies, one for each experimental unit (indexed with e), and on individual-level controls (indexed with i):

    $$ {Y}_{i,e}={\delta}_e+\lambda {Z}_i+{\varepsilon}_{i,e} $$

    The second step regresses the estimates of these dummies on the treatment variable, T, and experimental-unit-level controls, X:

    $$ {\widehat{\delta}}_e=\alpha +\beta {T}_e+\gamma {X}_e+{\xi}_e $$

  4. Clearly, this isn’t precise or “properly done.” It isn’t trying to be. It’s a ballpark estimate that you can approximate in your head, and get to the same place that more ponderous formal calculations would get you.

  5. This list includes only studies in economics that utilize nationwide data to test their hypotheses of interest. Some other studies use data on only a handful of states, and are less subject to the criticisms that follow (as is Feyrer et al.’s Table A13, a supplementary regression that includes state*year fixed effects). All of these papers were scouted in 2016, and those that were working papers at the time were followed up on later.

  6. If you think it’s crazy to imagine that geology influences behavior, think again. For instance, county-level “heat maps” of numerous social phenomena outline the shape of Appalachia.

  7. The geological part is the land scarcity on both coasts, for different reasons. The political part is zoning and laws governing mortgage refinancing.

  8. The notion that progressiveness influences the use of blind auditions is rejected by a probit model that relates the probability an orchestra adopts a screen to its proportion of female members, dropping orchestras from the sample in the year after they adopt a screen. Even two standard errors away from the coefficient estimate, the extent of reverse causation is small. This check is flimsy conceptually and econometrically. Conceptually, this proportion, a stock built up slowly over time, may poorly represent short-term changes in progressiveness. Econometrically, the experimental power of this test is heavily concentrated in the only non-adopting orchestra in the sample, Cleveland, because it remains in the sample twice as long, and the dependent and independent variables are both trending time series. In a crude replication, described below, removing Cleveland from the sample or using a more appropriate estimator yielded ambivalent results. (To “replicate” the authors’ endogeneity test, I created a crudely realistic data set including 47 observations for one non-adopting orchestra and 24 for the other ten, which each adopt blind auditions in that 24th year, putatively because of “progressiveness.” For all orchestras, the independent variable equals zero for the first thirteen years, and then grows at the same linear rate, with a touch of noise added, until the orchestra leaves the sample. The “t-statistic” on the probit coefficient was six times that on the coefficient in a more-appropriate hazard model, which has a low degree of statistical power; removing the non-adopting orchestra from the sample, the probit coefficient becomes unidentified.)

  9. This paper was published before clustered standard errors became common. Nonetheless, I hope the ensuing discussion convinces you that employing random effects dominates clustering in this situation, for the purposes of harmony, insight, and fidelity.

  10. Assuming the data is sufficient to the task, or that any data insufficiencies can be finessed. Otherwise, these issues may have to be explored more opaquely, using formal tests.
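
The two-step procedure in Note 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's own implementation: the data are simulated, the function names (`solve`, `ols`, `two_step`) and all parameter values are hypothetical, and there is a single individual-level control Z and a single unit-level control X. With one control, the first-step regression on unit dummies is computed through the equivalent within-unit (demeaned) estimator. Second-step standard errors are not computed here.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)

def two_step(data, T, X):
    """data: list of (unit e, Z_i, Y_ie); T and X: dicts keyed by unit."""
    units = sorted(T)
    groups = {e: [(z, y) for u, z, y in data if u == e] for e in units}
    zbar = {e: sum(z for z, _ in g) / len(g) for e, g in groups.items()}
    ybar = {e: sum(y for _, y in g) / len(g) for e, g in groups.items()}
    # Step 1: Y = delta_e + lambda * Z + eps, via within-unit demeaning.
    num = sum((z - zbar[e]) * (y - ybar[e]) for e, g in groups.items() for z, y in g)
    den = sum((z - zbar[e]) ** 2 for e, g in groups.items() for z, _ in g)
    lam = num / den
    delta_hat = {e: ybar[e] - lam * zbar[e] for e in units}
    # Step 2: delta_hat_e = alpha + beta * T_e + gamma * X_e + xi_e.
    alpha, beta, gamma = ols([[1.0, T[e], X[e]] for e in units],
                             [delta_hat[e] for e in units])
    return lam, alpha, beta, gamma

# Simulated example (all values hypothetical): 40 units, 50 individuals each.
rng = random.Random(0)
T = {e: float(e % 2) for e in range(40)}
X = {e: rng.gauss(0.0, 1.0) for e in range(40)}
data = []
for e in range(40):
    delta_e = 1.0 + 2.0 * T[e] + 0.5 * X[e] + rng.gauss(0.0, 0.3)
    for _ in range(50):
        z = rng.gauss(0.0, 1.0)
        data.append((e, z, delta_e + 0.8 * z + rng.gauss(0.0, 1.0)))

lam, alpha, beta, gamma = two_step(data, T, X)
print("lambda:", lam, "alpha:", alpha, "beta:", beta, "gamma:", gamma)
```

The point of the design is that the second regression has one observation per experimental unit, so its sample size matches the number of experiments rather than the number of individuals.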

References

  • Angrist J, Krueger A (1991) Does compulsory school attendance affect schooling and earnings? Q J Econ 106(4):979–1014

  • Angrist J, Pischke J-S (2009) Mostly harmless econometrics: an empiricist’s companion. Princeton University Press, Princeton, NJ

  • Ashley R, Parmeter C (2015) Sensitivity analysis for inference in 2SLS/GMM estimation with possibly flawed instruments. Empir Econ 49(4):1153–1171

  • Bertrand M, Duflo E, Mullainathan S (2004) How much should we trust differences-in-differences estimates? Q J Econ 119(1):249–275

  • Bound J, Jaeger D, Baker R (1995) Problems with instrumental variables estimation when the correlation between the instruments and the endogeneous explanatory variable is weak. J Am Stat Assoc 90(430):443–450

  • Cabral M, Cullen MR (2016) Estimating the value of public insurance using complementary private insurance. (No. w22583). National Bureau of Economic Research

  • Card D, Krueger A (1994) Minimum wages and employment: a case study of the fast food industry in New Jersey and Pennsylvania. Am Econ Rev 84(4):772–793

  • Cunningham S, Lindo J, Myers C, Schlosser A (2018) How far is too far? New evidence on abortion clinic closures, access, and abortions. (No. w23366). National Bureau of Economic Research

  • Dickens W (1990) Error components in grouped data: is it ever worth weighting? Rev Econ Stat 72(2):328–333

  • Feyrer J, Mansur E, Sacerdote B (2017) Geographic dispersion of economic shocks: evidence from the fracking revolution. Am Econ Rev 107(4):1313–1334. Available as a 2015 working paper

  • Goldin C, Rouse C (2000) Orchestrating impartiality: the impact of blind auditions on female musicians. Am Econ Rev 90(4):715–741

  • Gruber J, Kim J, Mayzlin D (1999) Physician fees and procedure intensity: the case of cesarean delivery. J Health Econ 18(4):473–490

  • Hamermesh D (2000) The craft of labormetrics. Ind Labor Relat Rev 53(3):363–380

  • Jaimovich D (2013) Missing links, missing markets: internal exchanges, reciprocity and external connections in the economic networks of Gambian villages. (No. 2209075). Social Science Research Network

  • James A, Smith B (2017) There will be blood: crime rates in shale-rich US counties. J Environ Econ Manag 84:125–152. Available as a 2014 working paper

  • Maniloff P, Mastromonaco R (2015) The local economic aspects of fracking. Manuscript, Colorado School of Mines and the University of Oregon. Working Paper. http://pages.uoregon.edu/ralphm/fracking_may_15.pdf

  • McCollum M, Upton G (2018) Local labor market shocks and residential mortgage payments: evidence from shale oil and gas booms. Resour Energy Econ 53:162–197

  • Moulton B (1990) An illustration of a pitfall in estimating the effects of aggregate variables on micro units. Rev Econ Stat 72(2):334–338

  • Munasib A, Rickman D (2015) Regional economic impacts of the shale gas and tight oil boom: a synthetic control analysis. Reg Sci Urban Econ 50:1–7

  • Paredes D, Komarek T, Loveridge S (2015) Income and employment effects of shale gas extraction windfalls: evidence from the Marcellus region. Energy Econ 47:112–120

  • Solon G, Haider S, Wooldridge J (2015) What are we weighting for? J Hum Resour 50(2):301–316

Food for Thought

  1. The opening to this chapter addressed the use and misuse of weighting across cross-sectional units of different “size,” such as states.

    (a) Set out an individual-level regression with a state-level policy variable and a state-level random effect, as in Eq. (8.1). Work with it to show that, in a state-level analysis, West Bengal should be weighted more than Kashmir, but not seven times more.

    (b) Chapter 3 addressed the same issue differently. Return to question #3 in that chapter and give it another try. Does our discussion of experimental content make it easier to answer?

  2. Figure 8.3 contains a somewhat-recent snapshot of the states that had adopted an important policy in the U.S. Its scale of implementation also seems “somewhat geologic.” (It is, in fact, extremely geologic. States that were covered by the Western Inland Sea–look it up–are far less likely to have adopted this policy.) What policy is this?

  3. The following questions all pertain to the chapter’s discussion of Gruber et al. (1999).

    (a) Compare my Table 8.1 with the original in Gruber et al. (1999). Note the differences. How do these differences reflect principles of effective description?

    (b) The choice to examine the sum of squares of the changes in the fee differential is not inconsequential. Could comparing the magnitudes of these values, rather than their squared magnitudes, be justified? Does examining the weighted sum of squares focus on how California affects the coefficient estimates or the standard errors?

    (c) The chapter claims that every “problem” with the experimental content of your analysis conforms to a violation of the classical OLS assumptions. If so, which key assumption is violated in Gruber et al.? Relate your answer to the techniques advocated in Bertrand et al. (2004).

  4. The following questions all pertain to the chapter’s discussion of Goldin and Rouse (2000).

    (a) Because orchestra auditions consist of several sequential rounds, the most natural econometric model for the situation analyzed by Goldin and Rouse (2000) would be an ordered probit, were the data suitable to support it. Assume you possessed an index of auditioner “quality.” Then lay out a suitable ordered probit model to examine the effect of blind auditions on females’ probability of advancement, in which the effect of discrimination against females shifts the thresholds required for advancement to the next round.

    (b) One way to think about the system’s dynamics is to create a latent variable for “progressiveness,” which increases at different rates within different orchestras, and which influences some of the coefficients and variables in Eq. (8.4). Incorporate this variable into this equation, and show how bias and serially correlated errors are likely to result.

    (c) I prefer the following specification to Eq. (8.4), though it is econometrically equivalent, because it better describes the actual process at work. How so?

      $$ P=\alpha +\beta B+\gamma F\cdot \left(1-B\right)+\delta X+\epsilon $$

  5. These questions pertain to Fig. 8.2.

    (a) The residuals for each orchestra but one signify an econometric issue. Identify the issue associated with each orchestra.

    (b) In this figure, the errors for each orchestra signify a different econometric problem. Is that likely to happen in practice, or is it more likely that many orchestras would display similar problems?

    (c) In general terms, explain how the “second-generation model” discussed in the text could be used to conduct the final analysis at the level of the experimental unit, as advocated in this chapter. Then explain why it would be necessary to do so, in order to get correct standard errors.

    (d) (Difficult.) Work out how to implement this approach, in order to get unbiased coefficient estimates and standard errors, under the assumption that the deterministic component of the model is correctly specified, and that there are no problems with serial correlation.

  6. The main specification in Card and Krueger’s (1994) two-period, two-state difference-in-difference analysis is as follows:

    $$ \Delta E=\alpha +\beta X+\gamma \mathrm{NJ}+\varepsilon $$

    where E is restaurant-level employment, X are restaurant-level controls, and NJ is a dummy for New Jersey, the minimum-wage-increasing state. Show that if a random effect is included in this specification at the level of the experimental unit, the true standard error of \( \widehat{\gamma} \) is unidentified and its OLS standard error is biased downward.

  7. Traffic accidents can be considered independent, (statistically) rare events, so the number of fatal accidents in any given place and time has a Poisson distribution. This naturally lends itself to a count data model. However, a pure Poisson regression has the classic problem that the conditional variance of fatalities equals its mean, which does not hold up in practice.

    One method of addressing this problem is to specify a generalized linear model:

    $$ {F}_{s,t}\sim \mathrm{Poisson}\left({f}_{s,t}\right) $$

    $$ \log \left({f}_{s,t}\right)=\alpha +\beta {X}_{s,t}+{\varepsilon}_{s,t} $$

    where \( {F}_{s,t} \) is the observed number of fatal accidents in location s in period t, X contains the explanatory variables, and \( {\varepsilon}_{s,t} \) is a random effect. The second line of this equation predicts the latent variable \( {f}_{s,t} \), which serves as the mean (and variance) of the Poisson distribution from which realized \( {F}_{s,t} \) is “drawn.”

    An alternative method is to specify a Negative Binomial Model:

    $$ {F}_{s,t}\sim \mathrm{NegBin}\left({f}_{s,t},\sigma \right) $$

    $$ \log \left({f}_{s,t}\right)=\alpha +\beta {X}_{s,t} $$

    where σ is the “overdispersion” parameter associated with the Negative Binomial distribution.

    (a) Which is the more natural model? Why?

    (b) Which model is more commonly used in economics? Why?

  8. Cunningham et al. (2018) examine how abortion clinic closures affect abortion rates in Texas, relating the change in counties’ abortion rates to the increase in the distance women must travel to have an abortion when a closer clinic closes (measured using five “increase in distance” categories). There are 254 counties in Texas and 42 abortion clinics in their sample, 18 of which closed during the interval in which a clinic-closing law took effect, leaving 11 of 16 metropolitan statistical areas without a clinic. There were nearly four million Texas pregnancies during the study period, approximately 10% of which ended in abortion.

    Sketch out the experimental content of this study. Identify the experimental unit and the spatial and temporal scale of variation in the key independent variable. To the nearest power of ten, how many independent experiments does this study contain?
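
A numerical check can complement the algebra that Question 6 asks for. The sketch below is a simulation under assumptions of my own, not Card and Krueger's data or code: the restaurant-level controls X are omitted, the true γ is set to zero, and the sample sizes and standard deviations are hypothetical. Each replication draws one random effect per state, computes the difference in state means (the OLS estimate of γ when NJ is the only regressor), and records the conventional OLS standard error.

```python
import math
import random
import statistics

def one_replication(rng, n_per_state, sd_state, sd_idio, gamma=0.0):
    """Simulate one two-state study; return (gamma_hat, conventional OLS SE)."""
    groups = []
    for nj in (0, 1):  # 0 = Pennsylvania, 1 = New Jersey
        shock = rng.gauss(0.0, sd_state)  # state-level random effect
        groups.append([gamma * nj + shock + rng.gauss(0.0, sd_idio)
                       for _ in range(n_per_state)])
    means = [statistics.fmean(g) for g in groups]
    gamma_hat = means[1] - means[0]
    # Pooled residual variance from the regression on a constant and NJ.
    rss = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s2 = rss / (2 * n_per_state - 2)
    ols_se = math.sqrt(s2 * (2.0 / n_per_state))
    return gamma_hat, ols_se

rng = random.Random(0)
results = [one_replication(rng, n_per_state=200, sd_state=1.0, sd_idio=2.0)
           for _ in range(500)]
empirical_sd = statistics.stdev(g for g, _ in results)
mean_ols_se = statistics.fmean(se for _, se in results)

print("spread of gamma_hat across replications:", empirical_sd)
print("average reported OLS standard error:   ", mean_ols_se)
```

The simulation can see the true spread only because it redraws the state shocks hundreds of times; a single two-state study offers one draw per state, which is the identification problem the question points to.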

Fig. 8.3 The shaded states in this map had adopted a particular policy, as of a date in the not-too-recent past. Can you tell what policy this is?
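
The two count models in Question 7 can be compared by simulation before reasoning about them. The sketch below rests on several assumptions that are mine rather than the chapter's: the explanatory variables are held fixed so that exp(α+βX) equals a hypothetical value of 5, the random effect ε is normal, and the Negative Binomial is drawn as a Poisson whose mean is multiplied by a gamma variate with mean one and variance σ (one common parameterization). It reports the variance-to-mean ratio, which equals one for a pure Poisson.

```python
import math
import random
import statistics

def poisson_draw(mean, rng):
    """Draw from a Poisson distribution (Knuth's method; fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def dispersion(draws):
    """Sample variance divided by sample mean."""
    return statistics.variance(draws) / statistics.fmean(draws)

rng = random.Random(1)
n = 20000
base = 5.0  # hypothetical value of exp(alpha + beta * X)

# Pure Poisson: no random effect.
pure = [poisson_draw(base, rng) for _ in range(n)]

# Poisson with a normal random effect in the log mean.
sd_eps = 0.5
mixed = [poisson_draw(base * math.exp(rng.gauss(0.0, sd_eps)), rng)
         for _ in range(n)]

# Negative Binomial, drawn as a gamma-Poisson mixture (assumed parameterization).
sigma = 0.3
negbin = [poisson_draw(base * rng.gammavariate(1.0 / sigma, sigma), rng)
          for _ in range(n)]

for name, draws in (("pure Poisson", pure),
                    ("Poisson + random effect", mixed),
                    ("Negative Binomial", negbin)):
    print(name, "variance/mean:", dispersion(draws))
```

Both alternatives generate overdispersed counts here; what differs is where the extra variation is placed in the model, which is the distinction the question asks you to weigh.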

Copyright information

© 2018 Springer Nature Switzerland AG

Cite this chapter

Grant, D. (2018). Econometric Modeling. In: Methods of Economic Research. Springer Texts in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-030-01734-7_8
