1 Introduction

As American students accumulate college loan debt (1.08 trillion dollars as of December 31, 2013, Source: NY Fed), there is a growing concern that expensive skills acquired in college may be underutilized in low-paying jobs. Existing studies estimate that around a third of American workers are “overeducated”—i.e., have more schooling than is necessary for their job.1 These would include, for example, a college graduate working as a cashier in a store. Estimated wage returns to this surplus schooling average 4.3%, or about half of the returns to required schooling. Thus, schooling mismatch appears to be an important source of the ex post heterogeneity in returns to schooling documented in the literature (see, e.g., Carneiro et al., 2003). At the aggregate level, overeducation could reflect skill mismatch and an inefficient allocation of workers to jobs.

What cross-sectional data miss is the possibility that overeducated workers may only be temporarily underemployed before switching to a job that requires their level of schooling. Further, low unobserved ability, compensating non-pecuniary job characteristics and career mobility considerations could rationalize apparent overeducation without the implication of a suboptimal schooling choice.2 In order to understand how much of a problem overeducation really is, it is crucial to go beyond the cross-sectional stylized facts and investigate longitudinal patterns.

This paper provides the first analysis of the career dynamics of overeducated US workers. Specifically, we use the National Longitudinal Survey of Youth 1979 (NLSY79), combined with the pooled 1989–1991 waves of the Current Population Survey (CPS), to examine how overeducated employment spells persist, exhibit duration dependence, and are associated with future wages. Most of our analysis will focus on the causes and consequences of overeducated employment among 2- and 4-year college graduates, who make up the bulk of overeducated workers.3

While the literature has paid comparatively little attention to the longitudinal dimension, analyzing transitions into and out of overeducated employment, together with their effects on wages, is key to disentangling the role played by labor market frictions versus other theories of overeducation.4 For example, if overeducation was due only to search frictions, one would expect this type of mismatch to be transitory and concentrated early in the career. Conversely, selection on ability, compensating wage differentials or career mobility motives would generate persistence in the overeducation patterns.5 The individual persistence and duration dependence of overeducated employment, together with the wage penalties associated with it, are also important for the design of unemployment insurance and training programs. For instance, encouraging early exit from unemployment may push more workers into overeducated work with potentially negative long-term effects on earnings.

The question of overeducation was first brought to the attention of economists and policymakers by Freeman (1976), who argued that excess supply of college graduates was causing the decline in the college wage premium observed in the USA during the 1970s. While the cross-sectional properties of overeducation are well-studied (see, e.g., Alba-Ramirez 1993; Hartog 2000; Kiker et al. 1997; Verdugo and Verdugo 1989), still little is known about the evolution of overeducation over the life cycle, although, as argued above, dynamics are of clear interest in this context. US evidence is particularly scarce.6

A notable exception is Rubb (2003) who provides evidence from the CPS that overeducation displays a substantial degree of persistence, with around 30% of the individuals overeducated in year t switching to a job which matches their level of education in year t+1. While duration dependence and dynamic selection effects imply that these transition rates are likely to decrease over the length of the spell, the CPS panels are too short to address this question adequately.

It is worth pointing out that while we borrow the wording “required level of education” from the existing literature, defining and measuring that concept is not a trivial task. In this paper, we use a statistical measure, in a similar spirit as, e.g., Verdugo and Verdugo (1989) and Kiker et al. (1997). Namely, we compute the mode of the distribution of schooling in the 1989–1991 CPS for each occupation in the 1980 3-digit Census Occupation classification. We also restrict the CPS sample used to compute the mode to individuals in the same birth cohorts as the NLSY79 respondents. The required levels of education are then defined as those within 15 percentage points of the schooling mode.7 The typical overeducated worker in our sample has 2 or 4 years of college education, but is working as a secretary or a cashier, say, among a majority of high school graduates. Relative to alternative approaches in the literature, and in particular those measuring the required levels of education with the General Educational Development (GED) scale provided by the Dictionary of Occupational Titles (see, e.g., Hartog 1980; Rumberger 1987), this approach is arguably more transparent and has the benefit of directly generating requirements in terms of years of education.

We document longitudinal patterns of overeducation for the NLSY79 cohort up to 12 years after labor market entry. Overeducation incidence within the cohort decreases as workers progress through their careers but remains sizeable 12 years after the first job. This suggests that, while frictions are likely to play a role, we need to appeal to other economic mechanisms to explain this long-term persistence. Overeducation is also a fairly persistent phenomenon at the individual level, with around 66% of overeducated workers remaining in overeducated employment after 1 year. We find that blacks and low cognitive ability workers (as measured by their AFQT scores) are not only more likely to be overeducated but also less likely to switch into matched jobs. That is, the longitudinal dimension magnifies the cross-sectional black-white and cognitive ability gaps.

The hazard rate out of overeducated work is also strongly decreasing in overeducation duration and drops by about 60% after 5 years. We estimate a mixed proportional hazard model (Elbers and Ridder 1982) of overeducated employment duration to investigate whether this decreasing hazard rate reflects selection on unobservables or true duration dependence. While composition effects based on observable characteristics explain away some of the duration dependence, further controlling for unobserved heterogeneity largely wipes it out. In other words, the duration of the overeducated employment spell does not have a significant impact on the probability of exiting overeducation. Instead, we identify large unobservable differences in the hazard rate: while overeducation is found to be very persistent for 30% of the sample, the rest are much more likely to exit quickly, in keeping with a frictional view of overeducation. The latter pattern provides clear evidence that there is generally more to overeducation than selection on unobserved ability and preferences.

We then revisit the classical augmented wage regression used in the overeducation literature and document a robust and statistically—as well as economically—significant negative association between past overeducated employment and wages.8 This pattern holds after controlling for observed measures of cognitive and non-cognitive skills. We further show that past overeducation remains associated with lower wages after adding controls for unobserved ability that are constructed from the heterogeneity types identified in the duration analysis.

The results from our preferred specification show that, for a non-overeducated worker, past overeducation is associated with a sizeable wage penalty of between 2.6 and 4.2%, which persists over 4 years. This provides a likely candidate mechanism behind the negative wage effects of graduating during a recession (Kahn 2010; Oreopoulos et al. 2012; Altonji et al. 2016; Liu et al. 2016), since, consistent with the cyclical upgrading literature (Bils and McLaughlin 2001), overeducation is likely to be more frequent during recessions.

The remainder of the paper is organized as follows. Section 2 describes the sample used in the analysis, the construction of the required schooling measure and compares the cross-sectional properties of our measure of overeducation to the literature. Section 3 documents the longitudinal patterns in the incidence of overeducated employment along the career. Section 4 estimates a mixed proportional hazard model of overeducated employment duration allowing us to separate true duration dependence from dynamic selection on observed and unobserved worker attributes. Section 5 presents results pertaining to the effect of past overeducation on wages, and Section 6 concludes.

2 Data

Our main data source is the NLSY79, which is a nationally representative sample of 12,686 young men and women who were 14–22 years old when they were first surveyed in 1979.9 We pool the observations for the 6111 individuals that comprise the core civilian cross section of the NLSY79, from the 1982 to the 1994 rounds, which results in 79,443 person-year observations.10 We then drop 5947 person-year observations with a level of education that was unreported or less than 12 years, 22,272 where the individual had not entered the labor market permanently, and 6228 that were non-interviews. We define the date of (permanent) entry into the labor market as the first survey year where the individual (1) is employed in the civilian labor force, (2) works more than 26 weeks out of the year, (3) is not enrolled in school as of May 1 of the survey year, and (4) has reached her highest level of education over the sample period 1982–1994. After making these cuts, we are left with a total of 44,996 observations corresponding to 4895 distinct individuals.
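The sample-size accounting above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text and assumes one survey round per year between 1982 and 1994; the variable names are illustrative.

```python
# Sample-size accounting, using only the counts reported in the text.
individuals = 6111               # core civilian cross section
rounds = 1994 - 1982 + 1         # assumed: one round per year, 1982..1994
person_years = individuals * rounds

# Person-year observations dropped, by reason.
cuts = {
    "education unreported or below 12 years": 5947,
    "not permanently in the labor market": 22272,
    "non-interviews": 6228,
}
final_sample = person_years - sum(cuts.values())

print(person_years)   # 79443
print(final_sample)   # 44996
```

Both figures agree with the 79,443 pooled person-years and the 44,996 observations of the final sample.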

The main variables of interest are the highest level of completed education, the occupation (measured using the 1980 3-digit Census code) and the hourly wage at the time of each interview.11 Besides these, the variables used in our analysis include age, minority status, gender and place of birth, cognitive and non-cognitive skill measures, geographical location and the corresponding local unemployment rate, family characteristics, a measure of hazards associated with the current occupation and employment history (see Table 1).

Table 1 Summary statistics: pooled cross section 1982–1994

2.1 Measuring required schooling

The NLSY79 does not have direct measures of required schooling in the job occupied by the respondent. In this paper, we use a statistical measure for the required level of education, in a similar spirit as, e.g., Verdugo and Verdugo (1989) and Kiker et al. (1997). Namely, for each given occupation in the 1980 3-digit Census Occupation classification, we compute the required level of education from the pooled monthly samples of the 1989–1991 waves of the CPS. Those years were chosen with two considerations in mind. First, they sit in the middle of the date range that we are analyzing. This minimizes the extent to which technological change might have altered the schooling requirements in some occupations. Second, the average unemployment rate (5.9% between 1989 and 1991, Source: Bureau of Labor Statistics) was low during these years. This reduces the likelihood that a bad labor market would push so many highly educated individuals into low schooling requirement occupations that the modal worker in those occupations would have more schooling than is necessary for their job.12 In order to obtain required schooling levels that are pertinent to the NLSY79 sample, we restrict the age range within each year of the CPS to that of the NLSY79 cohorts at that time (see Appendix for additional details on the CPS sample used in the analysis). Required schooling in a given occupation code is then defined as the mode of the distribution of the levels of education among the individuals working in that occupation.13

3-digit occupational codes correspond to a high level of disaggregation and constitute the finest description of occupations available for a long representative panel of US workers. Census occupational codes were defined, among other things, according to the skills involved in performing the job. For example, 3-digit codes distinguish among sales occupations between jobs that involve increasing degrees of knowledge and task complexity: from street vendors, to cashiers, sales workers (subdivided into eight different industry groups), sales representatives (six industry groups), sales engineers, and sales supervisors. However, there still might be unobserved heterogeneity in schooling requirements within some 3-digit occupations. One concern would be that a 3-digit occupation contains several occupations with different schooling requirements. With this in mind, for occupations such that the frequencies of two or more schooling levels are within 15 percentage points of each other, we choose to use a more conservative definition of over (and under)-education. Specifically, workers whose schooling attainments fall within the range defined by these schooling levels are classified as matched, while those with a higher (lower) level of education are defined as overeducated (undereducated). It is important to note that our results are robust to the choice of other cutoffs.14
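As an illustration, the classification rule can be sketched as follows. This is one possible reading of the rule, not the code used for the paper: a schooling level is treated as required when its share in the occupation is within 15 percentage points of the modal share, and workers between the lowest and highest required levels count as matched. The function names and the schooling counts are hypothetical.

```python
from collections import Counter

BAND = 0.15  # 15 percentage points, as in the text


def required_range(reference_schooling, band=BAND):
    """Return (lowest, highest) required schooling for one occupation.

    reference_schooling: schooling categories (12, 14, 16 or 18) of the
    reference-sample workers observed in that occupation.
    """
    counts = Counter(reference_schooling)
    total = sum(counts.values())
    shares = {level: n / total for level, n in counts.items()}
    modal_share = max(shares.values())
    # Keep every level whose share is within `band` of the modal share.
    near_modal = [lvl for lvl, share in shares.items()
                  if modal_share - share <= band]
    return min(near_modal), max(near_modal)


def classify(attained, low, high):
    """Compare attained schooling with the required range."""
    if attained > high:
        return "overeducated"
    if attained < low:
        return "undereducated"
    return "matched"


# Made-up schooling counts for two hypothetical occupations.
occupation_a = [12] * 70 + [14] * 20 + [16] * 10   # clear mode at 12
occupation_b = [14] * 10 + [16] * 50 + [18] * 40   # 16 and 18 within the band

range_a = required_range(occupation_a)
range_b = required_range(occupation_b)
print(range_a, classify(16, *range_a))  # (12, 12) overeducated
print(range_b, classify(18, *range_b))  # (16, 18) matched
```

In the second occupation a worker with 18 years of schooling is classified as matched rather than overeducated, which is the sense in which the definition is conservative.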

In order to mitigate concerns with classification error on attained schooling, we collapse our years of education variable into four categories: 12–13, 14–15, 16–17, and 18 or more years of completed education. This classification is natural since the four categories, simply referred to as 12, 14, 16, and 18 years of schooling in the rest of the paper, correspond respectively to high school graduates, 2-year college graduates, 4-year college graduates, and graduate school.15
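The collapsing step is a simple mapping from reported years to the four category labels. A minimal sketch, assuming the top category starts at 18 years and that observations below 12 years have already been dropped (the function name is hypothetical):

```python
def collapse_schooling(years):
    """Map completed years of education to the labels 12, 14, 16 or 18."""
    if years < 12:
        raise ValueError("observations below 12 years are not in the sample")
    if years >= 18:
        return 18
    # 12-13 -> 12, 14-15 -> 14, 16-17 -> 16
    return 12 + 2 * ((years - 12) // 2)


print([collapse_schooling(y) for y in range(12, 21)])
# [12, 12, 14, 14, 16, 16, 18, 18, 18]
```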

A clear advantage of this method is that it generates requirements directly in terms of years of education. This is in contrast to a common approach in the literature which maps occupations into skills first, and then skills into years of education (using, for example, the GED scale). In practice the latter approach is problematic, in particular since there is no clear consensus on the mapping between the skill content of occupations, as measured by the GED scale, and years of schooling (Leuven and Oosterbeek 2011). On the other hand, one limitation of our measure of required schooling is that it is based on the distribution of schooling attainment among workers employed in a given occupation, which is an equilibrium outcome of labor supply and demand decisions. This may result in understating the fraction of workers who are overeducated. However, in order for our measure of required schooling level to be biased upward in any given occupation, the fraction of overeducated individuals would have to be higher than the fraction of matched individuals, which is possible but likely to be rare. Besides, the use of a conservative definition of mismatch for those occupations such that the frequencies of two or more schooling levels are within 15 percentage points of each other implies that only in cases where the fraction of overeducated individuals is significantly higher than the fraction of matched individuals would our measure overestimate the actual required schooling level.

2.2 Summary statistics

Table 1 presents summary statistics for the NLSY79 variables used in our analysis. Of the 44,996 observations in our final sample, 84.7% are employed, 10.6% are out of the labor force, and 4.8% are unemployed. In terms of schooling, 66.1% are high school graduates, 12.9% have 2 years and 14.8% have 4 years of college education, and 6.3% have some graduate school experience.

Figure 1 shows the distribution of years of overeducation, generated by subtracting the level of education required by an individual’s occupation from that individual’s highest level of completed education. More than half of all observations have a perfect match of observed and required education, while progressively smaller fractions exhibit 1 or more years of education level mismatch. Comparing the two panels shows that collapsing schooling attainment into four categories preserves the shape of the distribution. It also mitigates the concern that small errors in the measure of years of schooling attainment will generate overeducation status misclassification.

Fig. 1 Distribution of overeducation

In Table 2, we further break down overeducation status by our categorical measure of attained schooling. It is apparent that overeducation is mechanically absent from the lowest schooling level (12 years of education) while undereducation is absent from the highest schooling level (18 years of education). Consistent with the existence of a relatively small number of jobs requiring 14 or 18 years of education, a very large fraction of the observations corresponding to those schooling levels exhibit education level mismatch. Although a larger share of jobs require 16 years of education, it is interesting to note that 37.4% of college graduates are overeducated, typically working in a job requiring 12 years of education. Table 3 lists the 10 occupations accounting for the largest number of observations in overeducated employment. Secretaries and sales workers account for the largest numbers of overeducated workers. For most of the occupations in the table, the modal worker has 12 years of schooling, and the typical overeducated worker has 2 or 4 years of college education. One notable exception is teachers, who typically have 16 years of schooling in the CPS.

Table 2 Overeducation status proportions by education
Table 3 Ten most frequent occupations among overeducated workers

2.3 Cross-sectional properties of our measure of overeducation

Before moving on to the analysis of the career dynamics of overeducated workers, we document the properties of our measure of overeducation in the cross section and find that they are in line with existing studies that use different data sets and overeducation measures. We report in Table 4 the estimation results from a Probit model, which allows the probability of overeducation to depend on a set of socio-demographic characteristics, ability measures, family characteristics, a measure of hazards associated with the current occupation and employment history.16 We stratify the regression by schooling level, consistent with our focus throughout the paper on overeducation as a labor market, rather than educational, phenomenon.

Table 4 Probit model of overeducation status

AFQT scores exhibit a negative and significant relationship with overeducation at all schooling attainments. This negative relationship between cognitive ability and the likelihood of overeducation, which is in line with prior findings in the literature (see, e.g., Allen and Van der Velden 2001; Chevalier and Lindley 2009), supports the idea that, conditional on working in any given occupation, individuals with relatively low ability tend to have acquired more skills through schooling than their higher ability peers.

Also in accordance with existing studies, females are about 5 to 13 percentage points more likely to be overeducated than males. Women may place more value on non-pecuniary characteristics associated with low-requirement jobs, such as flexibility in hours worked or their proximity to the family house, making it easier to combine work and home production activities. Alternatively, this could reflect discrimination on the part of employers. At any rate, given the existence of a substantial and persistent wage penalty of being overeducated (discussed in Section 5), this result implies that overeducation is an important aspect of the gender wage gap, which is absent from most of the literature on this question.17

Notably, there is also a strong positive correlation between minority status and overeducation at the 14 years of schooling level. Blacks and Hispanics are 15.8 percentage points and 12.1 percentage points, respectively, likelier than whites to be overeducated among that group. Among college graduates and above, the relationship becomes insignificant, which is possibly a result of a strong selection into college based on unobservable skills that are negatively correlated with overeducation.

Lastly, the evidence regarding the theory that overeducated workers accept lower wages in exchange for better non-pecuniary job characteristics is mixed. Workers in hazardous jobs are actually more likely to be overeducated at all three levels of schooling. On the other hand, overeducated workers are more likely to hold several jobs (except for those with 18 or more years of schooling), which possibly reflects a higher flexibility for these overeducated jobs. These results complement previous studies that have interpreted a negative correlation between overeducation and job satisfaction as evidence against the compensating wage differential model of overeducation (see, e.g., Hersch 1991; Korpi and Tåhlin 2009).

3 Longitudinal patterns in overeducation

In this section we document patterns in overeducation in the longitudinal dimension, including (i) the aggregate incidence of overeducated employment over the career, (ii) individual transitions into and out of overeducation, and (iii) the hazard rates out of overeducation.

Figure 2 displays the fraction of our sample that is overeducated over the first 12 years of work for the NLSY79 cohort, for workers with at least some college. Overall, the incidence of overeducation decreases by about 12 percentage points, from 62.3 to 50.4%, over the first 12 years of respondents’ careers.18 While the decline in aggregate overeducation rates as the career progresses does suggest that overeducation is in part frictional, the most striking feature of this graph is that the incidence of overeducation remains very high more than 10 years after labor market entry. Overall, this is a clear indication that overeducation is a persistent phenomenon.

Fig. 2 Composition of overeducation (fraction of respondents)

In Fig. 3, we further disaggregate this graph along several observable characteristics. Blacks do not exhibit the same reduction in overeducation as whites (panels 1 and 2). Similarly, panels 3 and 4 show that overeducation among females decreases much less than among males.

Fig. 3 Composition of overeducation (14 years of education or more)

Lastly, individuals at higher AFQT quartiles see a larger decline in overeducation incidence than those at lower quartiles (panels 5 through 8).

Overall, these results suggest that overeducated black, female, and low-AFQT workers are less likely to receive and/or accept offers from matched jobs. These dynamic patterns therefore accentuate their already higher propensity to be overeducated.

Individual patterns point to a similar story. Table 5 presents the fractions of individuals who are non-employed (defined as unemployed or out of the labor force), undereducated, overeducated, or matched at interview time, conditional on the status reported during the interview 1 year before, for workers with at least some postsecondary education. Overeducation is also persistent at the individual level, with 66.0% of workers remaining overeducated after 1 year. This fraction is about 20 percentage points higher than the unconditional overeducation rate (46.3%). By comparison, non-employed individuals have a smaller 49.0% chance of remaining in that state.19
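A one-year transition matrix of this kind can be tabulated from individual state histories as sketched below. The state labels, the function name and the three toy histories are hypothetical and serve only to show the calculation; they are not NLSY79 data.

```python
from collections import Counter, defaultdict


def transition_matrix(histories):
    """Row-normalized one-year transition shares.

    histories: one list of yearly states per individual, in time order.
    Returns {previous_state: {current_state: share}}.
    """
    counts = defaultdict(Counter)
    for states in histories:
        # Pair each year's state with the state one year later.
        for previous, current in zip(states, states[1:]):
            counts[previous][current] += 1
    matrix = {}
    for previous, row in counts.items():
        total = sum(row.values())
        matrix[previous] = {current: n / total for current, n in row.items()}
    return matrix


# Toy histories (made up).
histories = [
    ["over", "over", "matched", "matched"],
    ["over", "over", "over", "over"],
    ["nonemp", "over", "matched", "matched"],
]
matrix = transition_matrix(histories)
print(round(matrix["over"]["over"], 3))     # 0.667
print(round(matrix["over"]["matched"], 3))  # 0.333
```

Each row sums to one, so the diagonal entry for the overeducated state is the one-year persistence rate discussed in the text.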

Table 5 Transition matrix

Transition rates are further broken down by gender and race in Tables 6 and 7. Men are slightly more likely than women to remain in an overeducated job but also more likely to transition into matched jobs. Differences by race are sizeable: (i) overeducation is more persistent among blacks relative to whites, (ii) overeducated blacks are much less likely than whites to transition to a matched job, and (iii) these matched spells are less persistent for blacks than for whites. Finally, as illustrated in Table 8, the persistence of overeducation decreases monotonically with AFQT scores. All of these patterns in individual transitions confirm that the aggregate persistence described earlier does not result from cancelling flows in and out of overeducation.

Table 6 Transition matrix by gender
Table 7 Transition matrix by race
Table 8 Transition matrix by AFQT quartile

Table 5 also shows that transitions into overeducation are equally likely among workers who were undereducated, matched, or non-employed in the previous year.20 Among males, however, non-employed workers are much likelier to transition into overeducation than matched or undereducated workers (Table 6). Across all categories of workers, but especially for black and low-AFQT workers, the non-employed are more likely to transition into overeducation than into matched jobs (Tables 7 and 8). In addition, transitions into matched jobs are more common among the overeducated than the non-employed. Taken together, these patterns suggest that for some workers overeducation is a pathway from non-employment into matched employment.

Finally, the NLSY79 data allow us to go beyond annual transitions, and report the hazard rates out of overeducated employment as a function of the duration of overeducation (see Fig. 4).21 After 3 years of being overeducated, the probability of exiting that state, defined as starting a new matched or undereducated employment spell, drops from 39 to only 20%. This number further drops to 15 and 10% after 5 and 10 years, respectively. Overall, while this pattern is consistent with a negative duration dependence associated with overeducated employment, it could also result from compositional effects (permanent heterogeneity correlated with the hazard rate out of overeducated employment). We attempt to tell these two effects apart in the next section.
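For reference, a raw discrete-time hazard of this type is the number of spells ending at a given duration divided by the number of spells still at risk at that duration. The sketch below assumes spells are recorded as a duration in years plus an exit indicator, with censored spells flagged as not exited; the data and names are made up.

```python
def empirical_hazard(spells, max_duration):
    """Raw exit hazard by duration.

    spells: list of (duration_in_years, exited) pairs; exited is False
    when the spell is still ongoing at the end of the observation window.
    """
    hazards = {}
    for t in range(1, max_duration + 1):
        # Spells that lasted at least t years are at risk at duration t.
        at_risk = sum(1 for duration, _ in spells if duration >= t)
        exits = sum(1 for duration, exited in spells
                    if duration == t and exited)
        if at_risk:
            hazards[t] = exits / at_risk
    return hazards


# Toy spells (made up): (duration, exited)
spells = [(1, True), (1, True), (2, True), (2, False),
          (3, True), (3, False), (3, False), (4, False)]
hazards = empirical_hazard(spells, max_duration=4)
print(hazards[1])  # 0.25
```

Censored spells contribute to the risk set up to their last observed year but never to the exit count.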

Fig. 4 Hazard out of overeducation

4 Duration dependence versus dynamic selection

The results discussed so far provide some suggestive evidence of duration dependence in overeducation status, with a strongly decreasing hazard rate out of overeducation. However, in order to establish the role played by true, rather than spurious, duration dependence, we need to control for dynamic selection on worker attributes.

Specifically, we assume that the duration of the first spell of overeducated employment is determined by a mixed proportional hazard model, where the baseline duration follows a Weibull distribution. For this exercise we adopt the following definition of “overeducation spell”. We consider that an overeducated individual has exited their first overeducation spell when they become employed in an occupation that matches their schooling level, or when they become under-educated.22 By definition, this model is estimated on the (1648) individuals who have at least one overeducated spell, which mechanically excludes individuals with only 12 years of schooling.23 While using a parametric specification allows us to get more precise estimates, it is important to note that the mixed proportional hazard model is nonparametrically identified from single-spell data only (Elbers and Ridder 1982). The probability distribution function (pdf.) and cumulative distribution function (cdf.) of the duration of the overeducation spell, conditional on the set of observed individual characteristics \(\mathbf{x}_{i}\) and the unobserved heterogeneity \(\nu_{i}\), are respectively given by:

$$ f\left(t|\mathbf{x}_{i},\nu_{i},\alpha,\theta\right) = \exp\left(\mathbf{x}_{i}\theta\right)\alpha t^{\alpha-1}\nu_{i}\exp\left[-\exp\left(\mathbf{x}_{i}\theta\right)t^{\alpha}\nu_{i}\right], \tag{1} $$
$$ F\left(t|\mathbf{x}_{i},\nu_{i},\alpha,\theta\right) = 1-\exp\left[-\exp\left(\mathbf{x}_{i}\theta\right)t^{\alpha}\nu_{i}\right]. \tag{2} $$
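To make the roles of the parameters concrete, Eqs. (1) and (2) can be transcribed directly, as in the sketch below. The function names and the parameter values are illustrative assumptions, not estimates. The implied hazard, pdf divided by survival, is exp(xθ)·α·t^(α−1)·ν, which declines in t when α < 1 and is constant when α = 1.

```python
import math


def mph_pdf(t, x_theta, alpha, nu):
    """Eq. (1): density of the spell duration given x*theta and nu."""
    scale = math.exp(x_theta) * nu
    return scale * alpha * t ** (alpha - 1) * math.exp(-scale * t ** alpha)


def mph_cdf(t, x_theta, alpha, nu):
    """Eq. (2): probability that the spell has ended by duration t."""
    return 1.0 - math.exp(-math.exp(x_theta) * nu * t ** alpha)


def mph_hazard(t, x_theta, alpha, nu):
    """Exit hazard implied by Eqs. (1)-(2): pdf / (1 - cdf)."""
    return math.exp(x_theta) * nu * alpha * t ** (alpha - 1)


# Illustrative values only (not estimates from the paper).
x_theta, nu = -0.5, 1.0
falling = [mph_hazard(t, x_theta, 0.8, nu) for t in (1, 5, 10)]   # alpha < 1
constant = [mph_hazard(t, x_theta, 1.0, nu) for t in (1, 5, 10)]  # alpha = 1
print(falling[0] > falling[1] > falling[2])   # True
print(constant[0] == constant[1] == constant[2])  # True
```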

Following Heckman and Singer (1984), we assume that the unobserved heterogeneity follows a discrete distribution with R points of support. The parameters (α,θ) and the unobserved heterogeneity distribution are then estimated by maximizing the log-likelihood of the data, which is obtained by integrating out the unobserved types:

$$ \ell=\sum_{i=1}^{N}\left[d_{i}\log f\left(t_{i}|\mathbf{x}_{i},\alpha,\theta\right)+\left(1-d_{i}\right)\log\left[1-F\left(T_{i}|\mathbf{x}_{i},\alpha,\theta\right)\right]\right] $$

where \(N\) is the number of individuals in the sample with at least one overeducated spell, \(d_{i}=1\) if individual \(i\) leaves the overeducated state before the end of the survey (0 otherwise), \(t_{i}\) is the duration of the first overeducation spell (observed if \(d_{i}=1\)) and \(T_{i}\) is the length of time to the end of the survey. The pdf. of the overeducation spell duration is given by \(f(t|\mathbf{x}_{i},\alpha,\theta)=E_{\nu}(f(t|\mathbf{x}_{i},\nu_{i},\alpha,\theta))\) and the cdf. is given by \(F(t|\mathbf{x}_{i},\alpha,\theta)=E_{\nu}(F(t|\mathbf{x}_{i},\nu_{i},\alpha,\theta))\), where \(E_{\nu}(.)\) denotes the expectation operator with respect to the distribution of \(\nu\). Throughout our analysis, we consider that an overeducation spell ends when the individual starts a new employment spell in an occupation which does not require less than his level of schooling.24

It is interesting to compare the estimates for \(R=1\) (i.e., without unobserved heterogeneity) and \(R=2\) (i.e., with two unobserved heterogeneity types).25 Columns 2, 4, and 6 of Table 9 report the estimation results corresponding to the case without unobserved heterogeneity with alternative sets of individual controls. The estimated α is well below one (between 0.77 and 0.84 depending on the specification), which means that the hazard out of overeducation is strongly decreasing in the duration of overeducation even after controlling for observed heterogeneity. In other words, while part of this negative duration dependence is attributable to selection on observables, as shown by the increase in the estimated α parameter going from specification (3) to (1), the exit rate is still declining with the duration of the overeducation spell after controlling for an extensive set of observed characteristics.
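The log-likelihood with a discrete heterogeneity distribution can be evaluated as in the following sketch, which averages the type-specific densities (for completed spells) or survival probabilities (for censored spells) over the points of support. It is not the estimation code used for the paper: the function name, the data layout and the toy values are assumptions, and in practice this function would be passed to a numerical optimizer.

```python
import math


def mixture_loglik(spells, theta, alpha, support, probs):
    """Log-likelihood of single-spell data under a Weibull mixed
    proportional hazard model with discrete unobserved heterogeneity.

    spells:  list of (x, duration, exited); x is a list of covariates,
             duration is t_i if exited is True and T_i otherwise.
    support: points of support nu_1..nu_R; probs: their probabilities.
    """
    total = 0.0
    for x, duration, exited in spells:
        x_theta = sum(xj * bj for xj, bj in zip(x, theta))
        mixed = 0.0
        for nu, p in zip(support, probs):
            scale = math.exp(x_theta) * nu
            survival = math.exp(-scale * duration ** alpha)
            if exited:
                # Completed spell: type-specific density, Eq. (1).
                mixed += p * scale * alpha * duration ** (alpha - 1) * survival
            else:
                # Censored spell: type-specific survival, 1 - Eq. (2).
                mixed += p * survival
        total += math.log(mixed)
    return total


# Toy check with a single type (R = 1), alpha = 1 and x*theta = 0:
# the contributions are log(exp(-2)) and log(exp(-3)).
toy = [([1.0], 2.0, True), ([1.0], 3.0, False)]
print(mixture_loglik(toy, theta=[0.0], alpha=1.0, support=[1.0], probs=[1.0]))
# -5.0
```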

Table 9 Duration models (first overeducation spell)

By contrast, once we allow for unobserved heterogeneity, we obtain values for α that are very close to one, signifying the absence of true duration dependence (Table 9, columns 1, 3, and 5). In other words, the duration of the first overeducated employment spell does not seem to have a significant impact on the probability of exiting overeducation. The estimation results point to the existence of two groups of individuals with markedly different dynamics. Those two groups are identified from the variation in the duration of the first overeducation spell, conditional on observed characteristics. The first group has a low hazard (type 1, 29.1% of the ever-overeducated in the sample) while the second group is much more likely to exit overeducation quickly (type 2, 70.9%). As type 2s exit the pool of overeducated individuals, the probability that a random individual exits overeducation declines, as she is more and more likely to be a low-hazard, type 1 individual.

The ratio between the two unobserved heterogeneity parameters is \(\frac {V_{1}}{V_{2}}=0.11\), which implies that type 2s exit overeducation roughly 9 times as fast as type 1s. Interestingly, the stark difference in hazard rates between the two unobserved groups suggests that overeducation follows different mechanisms in each case. The high exit rate in group 2 is consistent with a frictional view of overeducation. For the remaining, low-hazard third of the sample, it could be that their aptitude (after controlling for AFQT) is not sufficient for jobs that match their level of formal schooling. Alternatively, they may have preferences for non-pecuniary, unobserved job characteristics found in some of the jobs that require less schooling, thus translating into highly persistent overeducation. In the next section, we explore these ideas further as we examine whether the two types differ in the wage patterns they exhibit.
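The dynamic selection mechanism can be illustrated numerically. The sketch below sets α = 1, so that each type has a constant hazard, and uses the type shares (29.1% and 70.9%) and the hazard ratio (0.11) reported above. The level of the type 2 hazard, 0.5 per year, is a hypothetical value chosen only for illustration, and covariates are ignored.

```python
import math

shares = [0.291, 0.709]        # type 1 and type 2 shares, from the text
type2_rate = 0.5               # hypothetical exit rate per year
rates = [0.11 * type2_rate, type2_rate]  # type 1 rate uses the 0.11 ratio


def survivor_mix(t, shares, rates):
    """Type shares among those still overeducated after t years."""
    weights = [p * math.exp(-rate * t) for p, rate in zip(shares, rates)]
    total = sum(weights)
    return [w / total for w in weights]


def aggregate_hazard(t, shares, rates):
    """Exit hazard of a randomly drawn survivor at duration t."""
    mix = survivor_mix(t, shares, rates)
    return sum(m * rate for m, rate in zip(mix, rates))


for t in (0, 1, 3, 5, 10):
    print(t,
          round(aggregate_hazard(t, shares, rates), 3),
          round(survivor_mix(t, shares, rates)[0], 3))
```

Although neither type's hazard changes with duration, the aggregate hazard falls over time because the pool of survivors is increasingly made up of low-hazard type 1 individuals.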

The coefficients on observable characteristics allow us to complement and refine the analysis of the raw transition dynamics presented in Section 3. The negative coefficient on blacks in all specifications of the duration model implies that the lower exit rates out of overeducation for that group are robust to controlling for observable and unobservable heterogeneity. Similarly, the positive effect of AFQT scores on overeducation exit also carries over from the raw transitions into the mixed proportional hazard model. The coefficient for women is also significantly negative when the full set of observable characteristics is included in the proportional hazard model. However, once unobserved heterogeneity is taken into account, being female no longer has a statistically significant relationship with the probability of exiting overeducation. This runs against the notion that discrimination would keep women in jobs where they are overeducated.

Lastly, an interesting finding is that time spent unemployed in the past is associated with lower propensities to exit overeducation. This result complements a previous finding in the literature that longer time spent unemployed reduces the probability of finding a job (see, e.g., Kroft et al. 2013). It also suggests that higher overeducation persistence is one of the mechanisms through which past unemployment affects future wages (see, e.g., Schmieder et al. 2016).

5 Dynamic effects of overeducation on wages

While the regressions found in the literature focus on the cross-sectional correlation between current overeducation and wages, we examine whether initial overeducation is also associated with lower wages later in the career. Figure 5 describes the median hourly wage among workers with 14, 16, and 18 years of education as they progress through their career, conditional on their overeducation status. The striking result here is that the negative association between wages and overeducation at the start of the career appears very persistent over time for the first two schooling categories (14 and 16 years). In the following sections, we provide additional evidence of these effects by showing that the negative association between past overeducated employment and wages still holds after controlling for observed heterogeneity (including AFQT scores and measures of non-cognitive skills), and for the unobserved heterogeneity types identified in the duration analysis.

Fig. 5 Path of median hourly wage

5.1 Augmented wage regressions

The impact of overeducation on wages has been measured in the literature by applying OLS to the following log-wage equation introduced by Duncan and Hoffman (1981):

$$\begin{array}{@{}rcl@{}} \log w_{it} = \alpha^{r} S_{it}^{r} + \alpha^{o}S_{it}^{o} + \alpha^{u}S_{it}^{u} + X_{it}'\beta + \varepsilon_{it} \end{array} $$

where, for any given individual i in year t, \(S_{it}^{r}\), \(S_{it}^{o}\) and \(S_{it}^{u}\) denote respectively the number of years of required schooling, years of overeducation (years of schooling above the required level), and years of undereducation (years of schooling below the required level), \(X_{it}\) is a vector of controls (including ability measures, socio-demographic background characteristics, labor market experience, and experience squared) and \(\varepsilon_{it}\) is an idiosyncratic productivity shock. This model, which nests the standard Mincerian wage regression (\(\alpha^{r}=\alpha^{o}=-\alpha^{u}\)), allows for the estimation of separate wage returns to the (i) required years of schooling, (ii) years of overeducation, and (iii) years of undereducation.
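To make the decomposition and the nesting of the Mincerian regression concrete, the sketch below splits actual schooling into its three components and fits the augmented equation by OLS. The toy data are hypothetical and noise-free (they are not drawn from the NLSY79), and the function name `decompose_schooling` is introduced here for illustration only.

```python
import numpy as np


def decompose_schooling(actual, required):
    """Split actual schooling into required, over- and undereducation years."""
    actual = np.asarray(actual, dtype=float)
    required = np.asarray(required, dtype=float)
    over = np.maximum(actual - required, 0.0)    # years above the requirement
    under = np.maximum(required - actual, 0.0)   # years below the requirement
    return required, over, under


# Hypothetical toy data: actual and required years of schooling.
actual = np.array([16, 16, 12, 14, 18, 12, 16, 14])
required = np.array([16, 12, 14, 14, 16, 12, 14, 16])
s_r, s_o, s_u = decompose_schooling(actual, required)

# Accounting identity: actual schooling = required + over - under.
assert np.allclose(actual, s_r + s_o - s_u)

# Generate log wages from a pure Mincer equation with a 10% return per year.
log_wage = 1.0 + 0.10 * actual

# OLS on the augmented specification (constant, S^r, S^o, S^u).
X = np.column_stack([np.ones(len(actual)), s_r, s_o, s_u])
coef, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
alpha_r, alpha_o, alpha_u = coef[1:]
print(alpha_r, alpha_o, alpha_u)
```

Because the data are generated from a Mincer equation, the fitted coefficients satisfy the restriction \(\alpha^{r}=\alpha^{o}=-\alpha^{u}\). In real data the three coefficients are free to differ, which is what the augmented specification is designed to detect.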

Table 10, panel 1, presents the pooled OLS parameter estimates for this model using our data.26

Table 10 Augmented log-wage regressions

Unlike for the rest of our analysis, individuals with 12 years of schooling are part of the estimation sample. Recall also that we restrict the sample to those individuals who have entered the labor market at some point between 1982 and 1994.27 While none of the individuals with 12 years of schooling are overeducated, including them helps identify the returns to the required years of schooling, years of overeducation, and years of undereducation. The results from our sample are broadly consistent with existing studies. Namely, we see a return of 9.5% for each additional year of required education, a return less than half as large (3.5%) for each additional year of education over the required level, and substantial wage penalties for years of undereducation (−6.5% for each additional year of undereducation).

These estimates also reveal a number of expected results. In particular, we see a gender wage gap of 15.3%, and a significant compensating wage differential for hazardous occupations. We also find significant wage premia relative to the North Central region of 9.6% and 6.9% for the Northeast and West, respectively, consistent with higher average costs of living in these regions. Lastly, measures for labor market experience all have the expected sign. Those with greater occupation-specific tenure and total labor market experience have higher wages. Conversely, those with frequent and/or long unemployment spells (as measured by total unemployment experience in weeks) and those frequently switching jobs during the year have lower wage rates, all else equal.

Following up on the dynamic patterns discussed at the beginning of the section, we report in panel 2 of Table 10 the estimation results from a log-wage regression which is further augmented with four lags in overeducation status. We also include interactions between lagged overeducation status and an indicator for the fact that the overeducation spell is ongoing. This disentangles the effect of past completed overeducation spells on wages when an individual is currently matched or undereducated, from the effect of the duration of an ongoing overeducated spell on current wages if an individual is still overeducated. Notably, we find wage penalties of 2.1–4.2% per year associated with up to four lags of overeducation. Those estimates, which show that having been overeducated in the past is associated with lower wages in current jobs irrespective of whether one is overeducated or not in them, are both statistically and economically significant for the first three lags.28 It is interesting to compare the magnitude of these “scarring” effects with those generated by past unemployment. While the drop in wage associated with the first lag of unemployment is found to be sizeably stronger than that associated with overeducation (7.7 vs. 4.2%), the penalty associated with past unemployment is also less persistent as further lags do not significantly affect wages.
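The construction of the lagged regressors can be sketched as follows. This is an illustration under a stated assumption, not the exact coding used for Table 10: the spell is treated as "ongoing" at lag k when the worker has been overeducated in every year from t−k through t. The function name and the status history are hypothetical.

```python
def lagged_overeducation_regressors(overeducated, n_lags=4):
    """Build lag dummies and 'ongoing spell' interactions for one worker.

    overeducated: list of 0/1 overeducation statuses ordered by year.
    Returns one dict of regressors per year t >= n_lags.
    """
    rows = []
    for t in range(n_lags, len(overeducated)):
        row = {"t": t, "over_now": overeducated[t]}
        for k in range(1, n_lags + 1):
            lag = overeducated[t - k]
            # Assumption: the spell is ongoing at lag k if the worker was
            # overeducated in every year from t-k through t.
            ongoing = int(all(overeducated[t - k : t + 1]))
            row[f"lag{k}"] = lag
            row[f"lag{k}_x_ongoing"] = lag * ongoing
        rows.append(row)
    return rows


# Hypothetical history: overeducated in years 0-2, matched in years 3-4,
# overeducated again in years 5-6.
history = [1, 1, 1, 0, 0, 1, 1]
rows = lagged_overeducation_regressors(history)
for row in rows:
    print(row)
```

With this coding, the coefficient on a plain lag captures the association between a completed past spell and current wages, while the interaction term picks up the additional association when the spell has not yet ended.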

These results show that the negative association between overeducation and future wages suggested by the raw wage data remains after including a rich set of controls, including measures of cognitive and non-cognitive ability. To the extent that, consistent with the cyclical upgrading literature (see, e.g., Bils and McLaughlin 2001), overeducated employment is likely to be more frequent during recessions, it follows that overeducation is a candidate mechanism behind the negative and persistent wage effects of graduating during a recession recently uncovered in the literature (Kahn 2010; Oreopoulos et al. 2012; Altonji et al. 2016; Liu et al. 2016).

5.2 Controlling for heterogeneity in unobserved ability

The estimation results discussed in the previous section show that overeducated employment spells are associated with lower current and future wages, even after controlling for cognitive and non-cognitive ability measures. However, one should exercise caution in interpreting these estimates in a causal fashion, as they may still suffer from omitted variable bias. In this section we exploit the estimates from the duration model in Section 4 to control for residual differences in unobserved ability.29

To the extent that low-ability individuals tend to remain overeducated longer, for instance through lower arrival rates of non-overeducated employment offers, we can expect the unobserved heterogeneity types from the duration model to be correlated with the residual unobserved ability components that could cause an omitted variable bias in the augmented wage regression.30

In the following, we use this idea to correct for some of the potential residual unobserved ability bias in our augmented wage regression.

Specifically, we add a type-specific intercept and estimate the model by weighted least squares, using as weights the posterior type probabilities obtained from the mixed proportional hazard model. For each individual i in the sample, we compute the posterior probabilities of being of each type using Bayes’ rule. Namely, the predicted posterior probability of being of type r∈{1,2} is given by:

$$\begin{array}{@{}rcl@{}} \widehat{\pi_{ir}}&=&\frac{f\left(t_{i}|\mathbf{x}_{i},\nu=\nu_{r},\widehat{\alpha},\widehat{\theta}\right)\widehat{\pi_{r}}}{f\left(t_{i}|\mathbf{x}_{i},\widehat{\alpha},\widehat{\theta}\right)} \end{array} $$

where \(f(t_{i}|\mathbf {x}_{i},\nu =\nu _{r},\widehat {\alpha },\widehat {\theta })\) and \(f(t_{i}|\mathbf {x}_{i},\widehat {\alpha },\widehat {\theta })\) denote the density of the duration of the first overeducation spell \(t_{i}\) evaluated at the estimated parameters \((\widehat {\alpha },\widehat {\theta })\), conditionally and unconditionally on being of type r. \(\widehat {\pi _{r}}\) is the estimated (unconditional) probability of being of type r. The heterogeneity types are primarily identified from instances in which the duration of the first overeducation spell goes in the opposite direction of what would be predicted using observable characteristics only. For example, high-AFQT individuals who remain overeducated for a long period of time will tend to have a high probability of being of type 1 (“low-hazard” type). Conversely, low-AFQT individuals who exit overeducation quickly will be more likely to be of type 2 (“high-hazard” type).
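The Bayes' rule computation can be sketched in a few lines. The example assumes a Weibull mixed proportional hazard of the form ν·exp(x′β)·α·t^(α−1) and ignores censored spells; this functional form, the argument name `xb`, and the numerical values of ν are assumptions made for illustration. Only the type shares and the ratio ν₁/ν₂ = 0.11 come from the text.

```python
import math


def weibull_density(t, alpha, scale):
    """Density of a duration whose hazard is scale * alpha * t**(alpha - 1)."""
    return scale * alpha * t ** (alpha - 1) * math.exp(-scale * t ** alpha)


def posterior_type_probs(t, xb, alpha, nus, priors):
    """Posterior probability of each type given a completed spell of length t.

    xb     : linear index of observed covariates (hypothetical name)
    nus    : type-specific heterogeneity terms (nu_1, nu_2)
    priors : estimated unconditional type probabilities (pi_1, pi_2)
    """
    joint = [p * weibull_density(t, alpha, nu * math.exp(xb))
             for nu, p in zip(nus, priors)]
    mixture = sum(joint)  # unconditional density f(t | x)
    return [j / mixture for j in joint]


# Type shares as reported in the text; nu values are hypothetical but respect
# the reported ratio nu_1 / nu_2 = 0.11.
priors = (0.291, 0.709)
nus = (0.055, 0.5)

short_spell = posterior_type_probs(t=1.0, xb=0.0, alpha=1.0, nus=nus, priors=priors)
long_spell = posterior_type_probs(t=8.0, xb=0.0, alpha=1.0, nus=nus, priors=priors)
print("1-year spell:", short_spell)
print("8-year spell:", long_spell)
```

Under these assumptions, a long first spell shifts the posterior toward type 1 and a short one shifts it toward type 2, relative to the prior shares.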

Estimation results are reported in Table 11.31 The first two panels correspond to the model without lagged overeducation regressors, while panels 3 and 4 incorporate four lags in overeducation status. In both cases we contrast our weighted least squares model (panels 1 and 3) with simple OLS (panels 2 and 4).
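One common way to implement a weighted least squares regression with a type-specific intercept is to stack each observation once per type and weight each copy by the corresponding posterior probability. The sketch below follows that approach. It is an assumption about the implementation, not a description of the exact procedure behind Table 11, and the function name and simulated data are hypothetical.

```python
import numpy as np


def wls_with_type_intercepts(y, X, post_type1):
    """Weighted least squares on data stacked by heterogeneity type.

    Each observation appears twice: as type 1 (weight = posterior probability
    of type 1) and as type 2 (weight = 1 - that probability). A type 1 dummy
    supplies the type-specific intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    p1 = np.asarray(post_type1, dtype=float)
    n = len(y)

    ones = np.ones((n, 1))
    block1 = np.hstack([ones, np.ones((n, 1)), X])    # type 1 copy
    block2 = np.hstack([ones, np.zeros((n, 1)), X])   # type 2 copy
    Z = np.vstack([block1, block2])
    yy = np.concatenate([y, y])
    w = np.concatenate([p1, 1.0 - p1])

    # WLS is OLS after scaling each row by the square root of its weight.
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], yy * sw, rcond=None)
    return coef  # [intercept, type 1 shift, slopes...]


# Simulated check with known types (posterior probabilities equal to 0 or 1).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=(n, 1))
is_type1 = rng.random(n) < 0.3
log_wage = 2.0 - 0.5 * is_type1 + 0.1 * x[:, 0]

coef = wls_with_type_intercepts(log_wage, x, is_type1.astype(float))
print(coef)
```

When the posterior probabilities are degenerate, as in this noise-free check, the estimator reduces to OLS with a known type dummy and recovers the coefficients used to generate the data. With non-degenerate posteriors, each individual contributes to both type-specific intercepts in proportion to the posterior weights.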

Table 11 Augmented log-wage regressions with heterogeneity types

In order to exploit the mixed proportional hazard model results, this model is estimated on the subsample of individuals with at least one reported overeducation spell. Observe first that, for all four specifications, the returns to required schooling and years of overeducation and the penalties to years of undereducation are lower than in the wage regressions of Table 10. This finding is consistent with overeducation being negatively correlated with ability. However, having been overeducated in the past still carries a wage penalty that is comparable to the one identified in the full sample. As shown in Table 11 (panel 3), these penalties for having been overeducated in the past remain significant, sizeable, and quantitatively similar to the estimates obtained without the inclusion of the type-specific intercept (panel 4) as well as those obtained using the full sample in Table 10. Although one cannot rule out the possibility that some alternative unobserved individual attributes might partly confound these estimates, the robustness of the results to the inclusion of the heterogeneity type-specific dummies does provide reassurance that the estimated earnings penalties associated with past overeducation spells are not driven by selection on unobserved ability. As such, our results support the existence of scarring effects from past overeducation, reminiscent of what has been identified in the case of long-term unemployment (Arulampalam et al. 2001).32

6 Conclusions

Although economists and policymakers have long been concerned about the determinants and wage effects of overeducation, little is still known about its dynamics along the career. This paper combines data from the NLSY79 and CPS to provide the first analysis of the career dynamics of overeducated US workers. Overall, we find overeducation to be a persistent phenomenon, particularly for blacks and low-ability workers, associated with persistently and sizeably lower wages. The latter finding bears similarity with the scarring effects that have been found to accompany prolonged unemployment spells.

Controlling for dynamic selection on unobservable characteristics is key in this context. While the exit rate from overeducation decreases quickly over the duration of an overeducation spell, the estimation of a mixed proportional hazard model suggests that, after controlling for both observed and unobserved heterogeneity, true duration dependence is not strong. Rather, the propensity to exit overeducated employment appears to be very heterogeneous among workers.

We find that past unemployment is associated with a longer duration of future overeducation spells, thus indicating that overeducation is likely to be one of the mechanisms through which the scarring effects on earnings associated with unemployment spells operate. The scarring effects associated with overeducation could also account for some of the negative wage effects of graduating during a recession which have been recently uncovered in the literature.

From a policy standpoint, since both overeducation and unemployment (see, e.g., Saporta-Eksten 2014) are associated with negative and persistent wage shocks, a relevant question becomes whether a marginal increase in unemployment duration is more or less harmful, in terms of lifetime earnings, than entering an overeducation spell. The answer has bearing on the design of unemployment insurance—should early exit be encouraged at the cost of more mismatch?—and the appropriate evaluation of the performance of employment agencies.

In sum, our results show that overeducation is a complex phenomenon that involves a number of the classical ingredients of labor economics: human capital, search frictions, ability differences, and perhaps, compensating wage differentials. In order to quantify the importance of each mechanism and explore the effects of unemployment insurance or schooling subsidy programs, a promising avenue would be to estimate a dynamic model of schooling and occupational choice that would nest these different channels, while allowing for correlated unobserved heterogeneity in job mobility and productivity.

7 Endnotes

1 The overeducation incidence and returns to overschooling numbers cited in this paragraph are obtained by Leuven and Oosterbeek (2011) by averaging estimates from 151 studies.

2 The career mobility factor was initially investigated by Sicherman and Galor (1990). The general idea is that high-skilled workers may face higher promotion probabilities in low-skilled jobs. It follows that forward-looking individuals may choose to become overeducated.

3 Throughout the paper, we use “matched” as a shorthand for being neither overeducated nor undereducated for one’s job. While we choose to focus our analysis on overeducation, it is worth noting that undereducation is also relatively common, in particular among individuals who have completed 14 years of education, and, as such, has the potential to account for some of the wage dispersion within this schooling category.

4 In practice we restrict our analysis to individuals who have completed their highest level of education over the sample period (1982–1994), so that all changes in overeducation status result from a change in employment rather than from a change in schooling attainment. See, e.g., Pavan (2011), Flabbi and Moro (2012) and Joubert (2015) who estimate dynamic models of occupational choice with search frictions.

5 See Gottschalk and Hansen (2003) who show that, in a model with two sectors, two skills (college and non-college) and heterogeneous preferences for each sector, some college workers may choose to work in the non-college sector. Uncertainty in returns to schooling is another channel which could generate persistence in overeducation, since some individuals may find it ex post optimal to be overeducated (see Lee et al. 2015).

6 Several studies have used German, British, Canadian or Australian data to estimate panel wage regressions (Bauer 2002; Frenette 2004), dynamic random effect models of overeducation exit (Mavromaras et al. 2013) or simply document overeducation status transitions (Dolton and Vignoles 2000). Pollmann-Schult and Büchel (2004) estimate a hazard model to investigate the effect of different covariates on overeducation duration for German vocational school graduates, but do not attempt to evaluate duration dependence itself.

7 Using this type of conservative measure of mismatch allows us to mitigate the risk of misclassification, which could arise if two or more occupations with different required levels of education were aggregated up at the 3-digit level. Depending on the aggregation level used for the educational attainment, our measure of overeducation yields incidence levels of between 18% and 25% among all workers, and as much as about 40% among college graduates.

8 Augmented wage regressions, pioneered by Duncan and Hoffman (1981) replace the usual years-of-education regressor with three terms: years of required education in the current occupation, years of education in excess of that required level and years of education below that required level. The corresponding coefficients are interpreted as returns to required education, returns to overeducation and returns to undereducation. In a similar spirit, a number of recent papers stress the importance of relaxing the homogeneous and linear returns to schooling assumption in the classical Mincer regression (see, e.g., Belzil and Hansen 2002; Heckman et al. 2006).

9 These data provide information on the 12,686 individuals interviewed in the initial 1979 wave, on a yearly basis from 1979 to 1994 and every two years afterwards. The initial wave comprises a core civilian cross-section of 6,111 individuals and an oversample of 5,295 black, Hispanic, and economically disadvantaged individuals born between January 1, 1957 and December 31, 1964. This is further supplemented by a military sample of 1,280 individuals born in the same period. We only keep the core cross-sectional sample of the NLSY79 in order to maintain a consistent sample between the NLSY79 and the CPS.

10 We restrict our sample to these years because it is the largest contiguous period where the NLSY79 reports the 1980 Census Occupation codes on an annual basis. We use the 1980 codes because they better reflect the set of occupations available over the period of interest than the 1970 codes.

11 In practice, we use the occupation and hourly wage corresponding to the current or most recent job at the time of the interview. When individuals hold multiple jobs at the same time, we use the occupation and wage corresponding to the job in which the respondent worked the most hours. We adjust for inflation by reporting all wages in constant dollars and then drop the top and bottom 2.5% of the reported person-year wages for every survey wave.

12 We investigated the sensitivity of the distribution of required schooling levels to the CPS waves used to construct the requirements. Specifically, we reconstructed the schooling requirements associated with each occupation using the pooled monthly samples of the 1983–1985 waves of the CPS. Less than 1% of the observations used in our analysis correspond to an occupation where the required schooling would differ when using this alternative measure.

13 Finite sample variability should not be a major concern here given the large number of observations (on average above 1,000) which are used to estimate the mode of each occupation. See, e.g., Dutta and Goswami (2010), who show that, for a Bernoulli distribution with sample size larger than 100, the mode of the empirical distribution matches the population mode with a probability close to 0.9.

14 Specifically, we considered four alternative definitions of over (and under)-education, using (i) the mode only (i.e. no cutoff), (ii) a 5% cutoff, (iii) a 10% cutoff and (iv) a 20% cutoff. Results from these alternative specifications are qualitatively similar to those obtained using our baseline definition.

15 It is also possible that errors in the occupation codes recorded in each interview of the NLSY79 could generate artificial transitions between overeducation statuses. In that case, our estimates of overeducation persistence could still be interpreted as lower bounds.

16 In practice we use a partial maximum likelihood estimator, clustering standard errors at the individual level. The resulting inference is robust to serial correlation in the unobserved determinants of overeducation.

17 One notable exception is the early analysis by Frank (1978).

18 Using a self-reported measure of overeducation, Dolton and Vignoles (2000) also find that its incidence among a 1980 cohort of U.K. university graduates decreased over time, from 38 to 30% 6 years after graduating.

19 With a different measure of overeducation and using CPS data, Rubb (2003) obtains a level of persistence for overeducated individuals of 73%.

20 Note that some of the transitions into overeducation from undereducated or matched employment are likely to mask non-employment spells occurring between two consecutive interviews.

21 For any given year t, these hazard rates are computed as the number of individuals leaving overeducated employment in year t, divided by the number of individuals who are still overeducated at the beginning of year t.

22 In other words, they may have been overeducated in several different jobs or unemployed at different points of that first overeducation spell.

23 Restricting the analysis to the subsample of individuals who have been overeducated at least once within the first three years of labor market entry does not significantly affect our results (results available from the authors upon request).

24 The NLSY79 allows us to compute the required level of education only for the current or most recent job at the time of each interview (the “CPS job”). To keep the exposition simple, the duration model we present in the text ignores the existence of job spells that might have occurred in between two interviews. We also estimated a model that explicitly takes into account these between-interview employment spells, treating the corresponding overeducation status as missing, which resulted in negligible differences in the estimation results (available upon request).

25 We estimated the model with more than two types, but including these additional types did not significantly improve the fit of the model. It resulted in a higher BIC than the model with two types while yielding similar values for the key parameters.

26 For the occupations such that the discrepancy between the frequency of the mode and the second most frequent schooling level is less than 15 percentage points, individuals whose schooling level falls within the range defined by the two most frequent schooling levels are assumed to be matched, and years of required schooling is set equal to their actual level of education. Individuals whose schooling level is higher (lower) than the upper (lower) bound of the aforementioned range are assumed to be overeducated (undereducated), and years of required schooling is set equal to that upper (lower) bound.

27 Only 3.1% of the individuals are excluded from our analysis by this selection step, suggesting that selection into employment is unlikely to be of first-order empirical importance.

28 Older lags, on the other hand, do not generate significant additional penalties (results available from the authors upon request).

29 Alternative approaches to correcting for the correlation between unobserved ability and overeducation status present serious challenges in this context. The instrumental variables approach requires a valid instrument for required schooling in addition to the usual instrument for schooling. In the fixed effect approach, the returns to schooling are identified off of individuals changing jobs with different required schooling levels, and, in particular, switching across matched, undereducated and overeducated jobs. These transitions are likely to be correlated with changes in unobserved wage determinants. Besides, the fixed effect estimates may not be generalizable to those who never switch overeducation status.

30 Note that, consistent with low-ability individuals being less likely to leave overeducation, AFQT has a positive and significant effect on the hazard rate out of overeducation (see Section 4).

31 In this model, duration of the first overeducation spell effectively plays the role of an exclusion restriction, which affects wages only indirectly through the heterogeneity types.

32 We re-estimated the duration model and wage regressions after excluding workers with 18 years of schooling or more from the sample (results available from the authors upon request). The results of the duration model are quantitatively very similar. In the augmented wage regressions, penalties for current and past overeducation are still present and even larger, with overeducated workers receiving negligibly small returns from their years of schooling beyond the required schooling level.

8 Appendix

8.1 CPS data

The 1989–1991 monthly CPS survey has a sample target of 50,000 households split into eight representative subsamples, each of which is interviewed for the first and last four months of a 16-month period. In any given month, a new sample of 6,250 households is surveyed for the first time. As a result, the pooled monthly CPS data from January 1989 through December 1991 contain 268,750 unique households.
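The household count follows from the rotation design, under the reading that all 50,000 households in the January 1989 sample are counted once and that each of the 35 subsequent months adds one new rotation group of 50,000/8 = 6,250 households:

$$ 50{,}000 + 35 \times 6{,}250 = 50{,}000 + 218{,}750 = 268{,}750. $$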

From these pooled cross-sections we keep only observations in the age range spanned by the NLSY79 cohort, which leaves 795,631 observations. Then we drop observations where an individual is unemployed, does not report a Census occupation code, has a missing level of education, did not complete the reported level of education, or is enrolled in college.

After making these cuts, we are left with a sample of 506,930 occupation and education level pairs, where the education level is defined as the highest grade achieved by the surveyed individual.

From this sample we estimate the required level of education for each occupation identified by its 3-digit Census occupation code. The required level of education is defined as the sample mode of the distribution of education levels among workers in the occupation. Then we match observed occupations in the NLSY79 to a required level of education based on their 3-digit occupation code. 125 out of 488 occupations are observed fewer than 100 times in our CPS pooled sample. In order to reduce the sampling variance of the corresponding required levels of education, we collapse these low-frequency occupations using 2-digit codes rather than 3-digit codes before applying the procedure described above. Less than 2% of our NLSY79 observations are in such occupations.
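The mode-based procedure, including the collapse of low-frequency occupations, can be sketched as follows. Two points are assumptions made for illustration: the mapping from 3-digit to 2-digit codes (here integer division by 10, passed as the hypothetical `coarsen` argument) and the choice to pool only the low-frequency occupations within each 2-digit group. The toy data use a threshold of 3 rather than 100 observations.

```python
from collections import Counter, defaultdict


def required_schooling(pairs, min_obs=100, coarsen=lambda code: code // 10):
    """Map each 3-digit occupation code to a modal (required) education level.

    pairs   : iterable of (occupation_code, education_level)
    min_obs : occupations with fewer observations are pooled by 2-digit group
    coarsen : hypothetical mapping from a 3-digit code to a 2-digit group
    """
    by_occ = defaultdict(Counter)
    for occ, educ in pairs:
        by_occ[occ][educ] += 1

    # Occupations observed too rarely to estimate their own mode reliably.
    low = {occ for occ, c in by_occ.items() if sum(c.values()) < min_obs}

    # Pool the low-frequency occupations within each 2-digit group.
    by_group = defaultdict(Counter)
    for occ in low:
        by_group[coarsen(occ)].update(by_occ[occ])

    required = {}
    for occ, counts in by_occ.items():
        pooled = by_group[coarsen(occ)] if occ in low else counts
        required[occ] = pooled.most_common(1)[0][0]  # sample mode
    return required


# Hypothetical toy data: (3-digit occupation code, highest grade completed).
pairs = ([(101, 12)] * 5 + [(101, 16)] * 2 + [(102, 16)] * 2
         + [(103, 16), (103, 12)] + [(207, 14)] * 4)
req = required_schooling(pairs, min_obs=3)
print(req)
```

In the toy data, occupations 102 and 103 each have two observations, so they share the mode of their pooled 2-digit group, while occupations 101 and 207 keep the mode of their own distribution.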