Abstract
This chapter treats an econometric model as an experiment, or set of experiments, embedded within an economic model. It shows how to unpack the “experimental content” of an econometric analysis, and demonstrates how this concept and the modeling principles from the preceding chapters contribute to the development of econometric models. These ideas come to life in applications to orchestra auditions, aluminum recycling, the returns to schooling, economic development in The Gambia, fracking, and more.
Notes
 1.
Except in New Zealand, where everyone starts school on their fifth birthday, and thus has the same number of years of required schooling.
 2.
We aren’t given the sampling variation in the birth-quarter means of schooling or earnings, which would help us figure this out more definitively. But a careful look at Fig. 8.1a indicates it is small, a consequence of the large number of individual observations. While the nonseasonal wiggle is sizeable, little of it appears to be random. Thus most of this wiggle comes from unobserved birth-quarter-level influences, not sampling error.
 3.
When state-level policies are analyzed using micro data, a two-step procedure can accomplish this goal. The first step regresses individual-level outcomes on a set of dummies for each experimental unit (indexed with e) and individual-level controls (indexed with i):
$$ {Y}_{i,e}={\delta}_e+\lambda {Z}_i+{\varepsilon}_{i,e} $$
The second step regresses the estimates of these dummies on the treatment variable, T, and experimental-unit-level controls, X:
$$ {\widehat{\delta}}_e=\alpha +\beta {T}_e+\gamma {X}_e+{\xi}_e $$
 4.
Clearly, this isn’t precise or “properly done.” It isn’t trying to be. It’s a ballpark estimate that you can approximate in your head, and get to the same place that more ponderous formal calculations would get you.
 5.
This list includes only studies in economics that utilize nationwide data to test their hypotheses of interest. Some other studies use data on only a handful of states, and are less subject to the criticisms that follow (as is Feyrer et al.’s Table A13, a supplementary regression that includes state*year fixed effects). All of these papers were scouted in 2016, and those that were working papers at the time were followed up on later.
 6.
If you think it’s crazy to imagine that geology influences behavior, think again. For instance, county-level “heat maps” of numerous social phenomena outline the shape of Appalachia.
 7.
The geological part is the land scarcity on both coasts, for different reasons. The political part is zoning and laws governing mortgage refinancing.
 8.
The notion that progressiveness influences the use of blind auditions is rejected by a probit model that relates the probability an orchestra adopts a screen to its proportion of female members, dropping orchestras from the sample in the year after they adopt a screen. Even two standard errors away from the coefficient estimate, the extent of reverse causation is small.
This check is flimsy conceptually and econometrically. Conceptually, this proportion, a stock built up slowly over time, may poorly represent short-term changes in progressiveness. Econometrically, the experimental power of this test is heavily concentrated in the only non-adopting orchestra in the sample, Cleveland, because it remains in the sample twice as long, and the dependent and independent variables are both trending time series. In a crude replication, described below, removing Cleveland from the sample or using a more appropriate estimator yielded ambivalent results. (To “replicate” the authors’ endogeneity test, I created a crudely realistic data set including 47 observations for one non-adopting orchestra and 24 for the other ten, which each adopt blind auditions in that 24th year, putatively because of “progressiveness.” For all orchestras, the independent variable equals zero for the first thirteen years, and then grows at the same linear rate, with a touch of noise added, until the orchestra leaves the sample. The “t-statistic” on the probit coefficient was six times that on the coefficient in a more appropriate hazard model, which has a low degree of statistical power; removing the non-adopting orchestra from the sample, the probit coefficient becomes unidentified.)
 9.
This paper was published before clustered standard errors became common. Nonetheless, I hope the ensuing discussion convinces you that employing random effects dominates clustering in this situation, for the purposes of harmony, insight, and fidelity.
 10.
Assuming the data is sufficient to the task, or that any data insufficiencies can be finessed. Otherwise, these issues may have to be explored more opaquely, using formal tests.
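The two-step procedure described in note 3 can be sketched in a few lines of Python. This is only an illustration on simulated data, not the procedure of any study cited here: the function names, sample sizes, and parameter values are assumptions made for the example, and the unit-level controls X are omitted from the second step for brevity. The first step exploits the fact that, with a full set of unit dummies, the coefficient on Z is the within-unit slope, so each estimated dummy can be recovered from the unit means.

```python
import random
from statistics import mean


def simulate(n_units=40, n_per_unit=50, beta=2.0, lam=1.5, seed=0):
    """Simulated micro data; every value here is hypothetical."""
    rng = random.Random(seed)
    rows, treat = [], {}
    for e in range(n_units):
        treat[e] = e % 2              # T_e varies only across experimental units
        xi = rng.gauss(0, 0.5)        # unit-level shock shared by everyone in unit e
        for _ in range(n_per_unit):
            z = rng.gauss(0, 1)       # individual-level control Z_i
            y = 1.0 + beta * treat[e] + xi + lam * z + rng.gauss(0, 1)
            rows.append((e, z, y))
    return rows, treat


def two_step(rows, treat):
    units = sorted(treat)
    # Step 1: Y = delta_e + lambda * Z + eps, estimated by OLS with unit dummies.
    # lambda-hat is the within-unit slope; delta-hat_e follows from the unit means.
    z_by = {e: [] for e in units}
    y_by = {e: [] for e in units}
    for e, z, y in rows:
        z_by[e].append(z)
        y_by[e].append(y)
    zbar = {e: mean(z_by[e]) for e in units}
    ybar = {e: mean(y_by[e]) for e in units}
    s_zy = sum((z - zbar[e]) * (y - ybar[e]) for e, z, y in rows)
    s_zz = sum((z - zbar[e]) ** 2 for e, z, _ in rows)
    lam_hat = s_zy / s_zz
    delta_hat = [ybar[e] - lam_hat * zbar[e] for e in units]

    # Step 2: delta-hat_e = alpha + beta * T_e + xi_e, one observation per unit.
    t = [treat[e] for e in units]
    tbar, dbar = mean(t), mean(delta_hat)
    s_tt = sum((ti - tbar) ** 2 for ti in t)
    beta_hat = sum((ti - tbar) * (di - dbar) for ti, di in zip(t, delta_hat)) / s_tt
    alpha_hat = dbar - beta_hat * tbar
    resid = [di - alpha_hat - beta_hat * ti for ti, di in zip(t, delta_hat)]
    s2 = sum(r * r for r in resid) / (len(units) - 2)   # df = number of units - 2
    return {
        "lambda": lam_hat,
        "alpha": alpha_hat,
        "beta": beta_hat,
        "se_beta": (s2 / s_tt) ** 0.5,
        "n_experiments": len(units),
    }


if __name__ == "__main__":
    data, treatment = simulate()
    print(two_step(data, treatment))
```

The point of the design is visible in the second step: the standard error of the treatment effect is computed from 40 unit-level observations, not from the 2,000 individual records that feed the first step.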
References
Angrist J, Krueger A (1991) Does compulsory school attendance affect schooling and earnings? Q J Econ 106(4):979–1014
Angrist J, Pischke JS (2009) Mostly harmless econometrics: an empiricist’s companion. Princeton University Press, Princeton, NJ
Ashley R, Parmeter C (2015) Sensitivity analysis for inference in 2SLS/GMM estimation with possibly flawed instruments. Empir Econ 49(4):1153–1171
Bertrand M, Duflo E, Mullainathan S (2004) How much should we trust differences-in-differences estimates? Q J Econ 119(1):249–275
Bound J, Jaeger D, Baker R (1995) Problems with instrumental variables estimation when the correlation between the instruments and the endogeneous explanatory variable is weak. J Am Stat Assoc 90(430):443–450
Cabral M, Cullen MR (2016) Estimating the value of public insurance using complementary private insurance. (No. w22583). National Bureau of Economic Research
Card D, Krueger A (1994) Minimum wages and employment: a case study of the fast food industry in New Jersey and Pennsylvania. Am Econ Rev 84(4):772–793
Cunningham S, Lindo J, Myers C, Schlosser A (2018) How far is too far? New evidence on abortion clinic closures, access, and abortions. (No. w23366). National Bureau of Economic Research
Dickens W (1990) Error components in grouped data: is it ever worth weighting? Rev Econ Stat 72(2):328–333
Feyrer J, Mansur E, Sacerdote B (2017) Geographic dispersion of economic shocks: evidence from the fracking revolution. Am Econ Rev 107(4):1313–1334. Available as a 2015 working paper
Goldin C, Rouse C (2000) Orchestrating impartiality: the impact of blind auditions on female musicians. Am Econ Rev 90(4):715–741
Gruber J, Kim J, Mayzlin D (1999) Physician fees and procedure intensity: the case of cesarean delivery. J Health Econ 18(4):473–490
Hamermesh D (2000) The craft of labormetrics. Ind Labor Relat Rev 53(3):363–380
Jaimovich D (2013) Missing links, missing markets: internal exchanges, reciprocity and external connections in the economic networks of Gambian villages. (No. 2209075). Social Science Research Network
James A, Smith B (2017) There will be blood: crime rates in shale-rich US counties. J Environ Econ Manag 84:125–152. Available as a 2014 working paper
Maniloff P, Mastromonaco R (2015) The local economic aspects of fracking. Manuscript, Colorado School of Mines and the University of Oregon. Working Paper. http://pages.uoregon.edu/ralphm/fracking_may_15.pdf
McCollum M, Upton G (2018) Local labor market shocks and residential mortgage payments: evidence from shale oil and gas booms. Resour Energy Econ 53:162–197
Moulton B (1990) An illustration of a pitfall in estimating the effects of aggregate variables on micro units. Rev Econ Stat 72(2):334–338
Munasib A, Rickman D (2015) Regional economic impacts of the shale gas and tight oil boom: a synthetic control analysis. Reg Sci Urban Econ 50:1–7
Paredes D, Komarek T, Loveridge S (2015) Income and employment effects of shale gas extraction windfalls: evidence from the Marcellus region. Energy Econ 47:112–120
Solon G, Haider S, Wooldridge J (2015) What are we weighting for? J Hum Resour 50(2):301–316
Food for Thought

1.
The opening to this chapter addressed the use and misuse of weighting across cross-sectional units of different “size,” such as states.

(a)
Set out an individual-level regression with a state-level policy variable and a state-level random effect, as in Eq. (8.1). Work with it to show that, in a state-level analysis, West Bengal should be weighted more than Kashmir, but not seven times more.

(b)
Chapter 3 addressed the same issue differently. Return to question #3 in that chapter and give it another try. Does our discussion of experimental content make it easier to answer?

2.
Figure 8.3 contains a somewhat recent snapshot of the states that had adopted an important policy in the U.S. Its scale of implementation also seems “somewhat geologic.” It is, in fact, extremely geologic: states that were covered by the Western Inland Sea (look it up) are far less likely to have adopted this policy. What policy is this?

3.
The following questions all pertain to the chapter’s discussion of Gruber et al. (1999).

(a)
Compare my Table 8.1 with the original in Gruber et al. (1999). Note the differences. How do these differences reflect principles of effective description?

(b)
The choice to examine the sum of squares of the changes in the fee differential is not inconsequential. Could comparing the magnitudes of these values, rather than their squared magnitudes, be justified? Does examining the weighted sum of squares focus on how California affects the coefficient estimates or the standard errors?

(c)
The chapter claims that every “problem” with the experimental content of your analysis conforms to a violation of the classical OLS assumptions. If so, which key assumption is violated in Gruber et al.? Relate your answer to the techniques advocated in Bertrand et al. (2004).

4.
The following questions all pertain to the chapter’s discussion of Goldin and Rouse (2000).

(a)
Because orchestra auditions consist of several sequential rounds, the most natural econometric model for the situation analyzed by Goldin and Rouse (2000) would be an ordered probit, were the data suitable to support it. Assume you possessed an index of auditioner “quality.” Then lay out a suitable ordered probit model to examine the effect of blind auditions on females’ probability of advancement, in which the effect of discrimination against females shifts the thresholds required for advancement to the next round.

(b)
One way to think about the system’s dynamics is to create a latent variable for “progressiveness,” which increases at different rates within different orchestras, and which influences some of the coefficients and variables in Eq. (8.4). Incorporate this variable into this equation, and show how bias and serially correlated errors are likely to result.

(c)
I prefer the following specification to Eq. (8.4), though it is econometrically equivalent, because it better describes the actual process at work. How so?
$$ P=\alpha +\beta B+\gamma F\cdot \left(1-B\right)+\delta X+\epsilon $$

5.
These questions pertain to Fig. 8.2.

(a)
The residuals for each orchestra but one signify an econometric issue. Identify the issue associated with each orchestra.

(b)
In this figure, the errors for each orchestra signify a different econometric problem. Is that likely to happen in practice, or is it more likely that many orchestras would display similar problems?

(c)
In general terms, explain how the “second-generation model” discussed in the text could be used to conduct the final analysis at the level of the experimental unit, as advocated in this chapter. Then explain why it would be necessary to do so in order to get correct standard errors.

(d)
(Difficult.) Work out how to implement this approach, in order to get unbiased coefficient estimates and standard errors, under the assumption that the deterministic component of the model is correctly specified, and that there are no problems with serial correlation.

6.
The main specification in Card and Krueger’s (1994) two-period, two-state difference-in-difference analysis is as follows:
$$ \Delta E=\alpha +\beta X+\gamma NJ+\varepsilon $$
where E is restaurant-level employment, X are restaurant-level controls, and NJ is a dummy for New Jersey, the minimum-wage-increasing state. Show that if a random effect is included in this specification at the level of the experimental unit, the true standard error of \( \widehat{\gamma} \) is unidentified and its OLS standard error is biased downward.
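Before working through the algebra, it can help to see the problem in a simulation. The sketch below is an illustration only and uses no data from Card and Krueger: the sample size, the standard deviations of the state-level and restaurant-level errors, and the function names are all assumptions, and the controls X are dropped so that the OLS estimate of γ is simply the difference in state means. It compares the conventional OLS standard error with the actual spread of the estimate across repeated draws of the two state-level shocks.

```python
import random
from statistics import mean, stdev


def one_replication(rng, n=200, gamma=0.0, sd_state=1.0, sd_store=3.0):
    """One hypothetical two-state sample; returns (gamma_hat, OLS standard error)."""
    groups = []
    for nj in (0, 1):
        shock = rng.gauss(0, sd_state)        # random effect common to one state
        groups.append(
            [gamma * nj + shock + rng.gauss(0, sd_store) for _ in range(n)]
        )
    other, new_jersey = groups
    m0, m1 = mean(other), mean(new_jersey)
    gamma_hat = m1 - m0                        # OLS coefficient on the NJ dummy
    ss = sum((v - m0) ** 2 for v in other) + sum((v - m1) ** 2 for v in new_jersey)
    s2 = ss / (2 * n - 2)                      # residual variance from OLS
    ols_se = (s2 * (1 / n + 1 / n)) ** 0.5     # conventional OLS standard error
    return gamma_hat, ols_se


def monte_carlo(reps=300, seed=1):
    rng = random.Random(seed)
    draws = [one_replication(rng) for _ in range(reps)]
    true_sd = stdev([g for g, _ in draws])     # actual sampling variability
    avg_ols_se = mean([se for _, se in draws]) # what OLS reports on average
    return true_sd, avg_ols_se


if __name__ == "__main__":
    true_sd, avg_ols_se = monte_carlo()
    print(f"spread of gamma-hat across replications: {true_sd:.2f}")
    print(f"average OLS standard error:              {avg_ols_se:.2f}")
```

Within any single replication, each state shock is absorbed into that state’s mean, so the residuals carry no trace of it; only by redrawing the shocks, which real data never allow with two states, does its contribution to the variability of the estimate become visible.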

7.
Traffic accidents can be considered independent, (statistically) rare events, so the number of fatal accidents in any given place and time has a Poisson distribution. This naturally lends itself to a count data model. However, a pure Poisson regression has the classic problem that the conditional variance of fatalities equals its mean, which does not hold up in practice.
One method of addressing this problem is to specify a generalized linear model:
$$ {F}_{s,t}\sim \mathrm{Poisson}\left({f}_{s,t}\right) $$
$$ \log \left({f}_{s,t}\right)=\alpha +\beta {X}_{s,t}+{\varepsilon}_{s,t} $$
where \( {F}_{s,t} \) is the observed number of fatal accidents in location s in period t, X contains the explanatory variables, and \( {\varepsilon}_{s,t} \) is a random effect. The second line of this equation predicts the latent variable \( {f}_{s,t} \), which serves as the mean (and variance) of the Poisson distribution from which realized \( {F}_{s,t} \) is “drawn.”
An alternative method is to specify a Negative Binomial Model:
$$ {F}_{s,t}\sim \mathrm{NegBin}\left({f}_{s,t},\sigma \right) $$
$$ \log \left({f}_{s,t}\right)=\alpha +\beta {X}_{s,t} $$
where σ is the “overdispersion” parameter associated with the Negative Binomial distribution.

(a)
Which is the more natural model? Why?

(b)
Which model is more commonly used in economics? Why?
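The overdispersion that motivates both specifications in question 7 is easy to reproduce. The sketch below simulates the first specification, a Poisson count whose log mean includes a normally distributed random effect, and reports the ratio of the variance of the counts to their mean. It is an illustration under assumed values: the baseline rate of five accidents, the standard deviation of the random effect, the absence of covariates X, and the helper names are all choices made for the example. The Poisson sampler uses Knuth’s multiplication method because Python’s standard library does not provide one.

```python
import math
import random
from statistics import mean, pvariance


def draw_poisson(rng, rate):
    """Knuth's multiplication method; adequate for the modest rates used here."""
    limit, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def dispersion(sd_effect, n=20000, alpha=math.log(5.0), seed=2):
    """Variance-to-mean ratio of counts with log(f) = alpha + eps, eps ~ N(0, sd_effect^2)."""
    rng = random.Random(seed)
    counts = [
        draw_poisson(rng, math.exp(alpha + rng.gauss(0, sd_effect)))
        for _ in range(n)
    ]
    return pvariance(counts) / mean(counts)


if __name__ == "__main__":
    print("no random effect:  ", round(dispersion(0.0), 2))   # pure Poisson
    print("with random effect:", round(dispersion(0.5), 2))   # overdispersed
```

With the random effect switched off, the ratio should sit close to one, as a pure Poisson requires; switching it on pushes the ratio well above one, since the variance of the latent mean is added to the Poisson variance.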

8.
Cunningham et al. (2018) examine how abortion clinic closures affect abortion rates in Texas, relating the change in counties’ abortion rates to the increase in the distance women must travel to have an abortion when a closer clinic closes (measured using five “increase in distance” categories). There are 254 counties in Texas and 42 abortion clinics in their sample, 18 of which closed during the interval in which a clinic-closing law took effect, leaving 11 of 16 metropolitan statistical areas without a clinic. There were nearly four million Texas pregnancies during the study period, approximately 10% of which ended in abortion.
Sketch out the experimental content of this study. Identify the experimental unit and the spatial and temporal scale of variation in the key independent variable. To the nearest power of ten, how many independent experiments does this study contain?
© 2018 Springer Nature Switzerland AG
Grant, D. (2018). Econometric Modeling. In: Methods of Economic Research. Springer Texts in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-030-01734-7_8
Publisher Name: Springer, Cham
Print ISBN: 9783030017330
Online ISBN: 9783030017347
eBook Packages: Economics and Finance (R0)