Is everybody an expert? An investigation into the impact of professional versus user reviews on movie revenues



This study is the first attempt to examine the effect of electronic word of mouth (user reviews) relative to expert reviews on moviegoing decisions. For the first time, we use time-varying data on expert reviews. We find that expert ratings matter much more for moviegoing decisions than user ratings and volume. Our data also show that experts tend to be more critical but more consistent in their reviews than users. We find that experts, but not eWOM, affect wide release moviegoing, contrary to industry thinking. Finally, we show that experts’ reviews matter most when consumers and critics are in closer agreement about the quality of the film. The study uses OLS as well as instrumental variables analysis to account for possible endogeneity.



  1.

    Conflicting findings occur even in the context of the same industry (e.g., movies): Liu (2006) and Duan et al. (2008) find that the volume, but not the valence, of consumer reviews is significantly associated with movie revenues, whereas Chintagunta et al. (2010) find that it is the valence of eWOM, rather than the volume, that drives revenues. The elasticities calculated also vary a great deal. A few studies, in particular some movie and book studies, find negative elasticities, and You et al. (2015) suggest that “poor ratings can result in sales especially because the marginal cost of these products is so low” (ibid., p. 34). We are not sure how to reconcile these findings—our results suggest a robust positive elasticity for volume and valence even in the presence of expert reviews.

  2.

    In our dataset, there are 194 movies, each followed for 10 weeks. Thus there are 1940 potential observations—movie-week data points. However, there are 1629 observations in the OLS regressions due to some movies leaving the market before the full 10 weeks as well as missing observations for other variables.

  3.

    We should note that the theatrical market has not changed much in the last 10 years in terms of admissions and real revenues (see Theatrical Market Statistics), apart from the shift to digital projection. The other change in distribution, toward streaming, does not affect the impact of user and professional critic reviews on weekly revenues, which is what we study here.

  4.

    We note that Gopinath et al. (2013) identify \(\mu_{i}\) using a two-stage approach that controls for time-invariant unobservables, but their identification assumption is based on the standard random-effects approach. First, they estimate box office in different markets at different points in time incorporating movie fixed effects. Second, they regress the movie fixed-effects coefficients from the first stage on time-invariant regressors to recover \(\mu_{i}\). However, this approach to identifying time-invariant regressors relies on the assumption that they are not correlated with any unobserved movie-specific effects (Greene 2011a), which reduces to the standard assumptions of a random-effects model (Greene 2011a, b).

  5.

    The effect of age is not separately identified from the constant in Eq. (2) when differenced.

  6.

    We reject the null hypothesis that our endogenous regressors are exogenous with \(\chi^{2} \left( 5 \right) = 71.623\) and associated p value < 0.0000.

  7.

    We should note that Liu (2006) and Babic et al. (2016) find that box office is more sensitive to user volume than to valence. However, one important difference between our approach and Liu’s (2006) is that Liu (2006) uses measures only from the previous week and does not consider cumulative variables. While we do not display them here for brevity, our results in Models 1 and 2 are similar to Liu’s (2006) when we use the same weekly variables. The approach we take in the paper is similar to that of Chintagunta et al. (2010) and Gopinath et al. (2013) in that we use cumulative measures for our variables (e.g. total user volume up to week t − 1 rather than user volume in week t − 1 only).

  8.

    We reject the null hypothesis that our endogenous regressors are exogenous with \(\chi^{2} \left( 7 \right) = 50.274\) and associated p value < 0.000.

  9.

    Testing the null hypothesis that the endogenous regressors are exogenous in Model 6 yields \(\chi^{2} \left( 7 \right) = 27.437\) and associated p value = 0.0003.

  10.

    As a robustness check we ran the estimations below using the less than or equal to constraint for classifying the relatively larger disagreement observations. Our results are qualitatively similar to those presented below.

  11.

    Testing the null hypothesis that the endogenous regressors are exogenous in Model 8 yields \(\chi^{2} \left( 7 \right) = 23.893\) and associated p value = 0.0012; similarly, Model 10 yields \(\chi^{2} \left( 7 \right) = 33.725\) and associated p value < 0.0000.

  12.

    Private conversation with one of the authors and a New York Times movie reviewer.

  13.

    We add 1 before taking the log to ensure the variable is defined when the level is 0.

  14.

    While critics’ ratings may be endogenous, we should emphasize that critics are assigned to review a movie by their editor, and studios do not have a role in that decision.

  15.

    RT keeps track of each critic’s review history, which can be found by clicking on the critic’s name. The review history contains all previous reviews the critic has written, including the date of each review.

  16.

    We add 1 before taking the log to ensure the variable is defined when the level is 0.

  17.

    We thank an anonymous referee for pointing this out.

  18.

    To provide additional evidence for this assertion, we calculate the GMM distance statistic (Hayashi 2000) to directly test the validity of all instruments created using the lagged level of advertising. The GMM distance statistic compares the Hansen’s J of two models—one including all instruments and one excluding the instruments created using lagged level advertising—where exogeneity of all instruments is the null hypothesis. Under the null, the test statistic is chi-square with degrees of freedom equal to the number of suspect instruments. The alternative hypothesis is that the suspect instruments are correlated with the error term and therefore invalid. This test relies on the assumption that the instruments not being tested are indeed exogenous. We believe this is likely given the theoretical arguments and the literature support we provide above. The GMM distance statistic testing the exogeneity of lagged level advertising is \(\chi^{2} \left( 1 \right) = 0.557\) with a p value = 0.4554. Failure to reject the null provides additional support that lagged level advertising is exogenous.

  19.

    We thank an anonymous referee for suggesting this approach.

  20.

    We note that we find significant heteroskedasticity in each first-stage estimation using the Wald test for groupwise heteroskedasticity proposed by Greene (2011a, b) for panel data.

  21.

    Additional instruments created using \(AvgPrevRev_{it}\) for both critical valence and variance were never significant.

  22.

    We note that the impact of critic experience on \(\Delta CriticRating\) is consistent with our priors, since the negative coefficient on \(Log\left( {1 + AvgPrevRev_{it} } \right)\) dominates the positive coefficient on \(AvgPrevRev_{it}\) for all values of average experience in our dataset.


  1. Babic, A., Sotgiu, F., de Valck, K., & Bijmolt, T. H. (2016). The effect of electronic word of mouth on sales: A meta-analytic review of platform, product, and metric factors. Journal of Marketing Research, 53(3), 297–318.

  2. Baker, A. M., Donthu, N., & Kumar, V. (2016). Investigating how word of mouth conversations about brands influence purchase and retransmission intentions. Journal of Marketing Research, 53, 225–239.

  3. Basuroy, S., Chatterjee, S., & Ravid, S. A. (2003). How critical are critical reviews? The box office effects of film critics, star power, and budgets. Journal of Marketing, 67(4), 103–117.

  4. Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and applications. Cambridge: Cambridge University Press.

  5. Chen, Y., Liu, Y., & Zhang, J. (2012). When do third-party product reviews affect firm value and what can firms do? The case of media critics and professional movie reviews. Journal of Marketing, 76(2), 116–134.

  6. Chevalier, J. A., & Mayzlin, D. (2006). The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research, 43(3), 345–354.

  7. Chintagunta, P. K., Gopinath, S., & Venkataraman, S. (2010). The effects of online user reviews on movie box office performance: Accounting for sequential rollout and aggregation across local markets. Marketing Science, 29(5), 944–957.

  8. De Vany, A., & Walls, W. D. (1999). Uncertainty in the movies: Can star power reduce the terror at the box office? Journal of Cultural Economics, 23(4), 235–318.

  9. De Vany, A., & Walls, W. D. (2002). Does hollywood make too many R-rated movies? Risk, stochastic dominance, and the illusion of expectation. Journal of Business, 75(3), 425–451.

  10. De Vries, L., Gensler, S., & Leeflang, P. S. H. (2017). Effects of traditional advertising and social messages on brand-building metrics and customer acquisition. Journal of Marketing, 81(5), 1–15.

  11. Duan, W., Bin, G., & Whinston, A. B. (2008). The dynamics of online word-of-mouth and product sales—An empirical investigation of the movie industry. Journal of Retailing, 84(2), 233–242.

  12. Einav, L. (2007). Seasonality in the U.S. motion picture industry. Rand Journal of Economics, 38(1), 127–145.

  13. Elberse, A. (2007). The power of stars: Do star actors drive the success of movies? Journal of Marketing, 71(4), 102–120.

  14. Elberse, A., & Anand, B. (2007). The effectiveness of pre-release advertising for motion pictures: An empirical investigation using a simulated market. Information Economics and Policy, 19(3), 319–343.

  15. Elberse, A., & Eliashberg, J. (2003). Demand and supply dynamics behavior for sequentially released products in international markets: The case of motion pictures. Marketing Science, 22(3), 329–354.

  16. Eliashberg, J., & Shugan, S. M. (1997). Film critics: Influencers or predictors? Journal of Marketing, 61(April), 68–78.

  17. Gopinath, S., Chintagunta, P. K., & Venkataraman, S. (2013). Blogs, advertising, and local-market movie box office performance. Management Science, 59(12), 2635–2654.

  18. Greene, W. H. (2011a). Econometric analysis (7th ed.). Upper Saddle River: Prentice Hall.

  19. Greene, W. H. (2011b). Fixed effects vector decomposition: A magical solution to time-invariant variables in fixed effects models? Political Analysis, 19(2), 135–146.

  20. Hayashi, F. (2000). Econometrics. Princeton: Princeton University Press.

  21. Ho, J. Y. C., Dhar, T., & Weinberg, C. B. (2009). Playoff-payoff: Superbowl advertising for movies. International Journal of Research in Marketing, 26(3), 168–179.

  22. Holbrook, M. B. (1999). Popular appeal versus expert judgments of motion pictures. Journal of Consumer Research, 26(September), 144–155.

  23. Houston, M., Kupfer, A.-K., Hennig-Thurau, T., & Spann, M. (2018). Pre-release consumer buzz. Journal of the Academy of Marketing Science, 46(2), 338–360.

  24. Imbens, G. W. (2002). Generalized method of moments and empirical likelihood. Journal of Business and Economic Statistics, 20, 493–506.

  25. Klein, L. R., & Ford, G. T. (2003). Consumer search for information in the digital age: An empirical study of prepurchase search for automobiles. Journal of Interactive Marketing, 17(3), 29–49.

  26. KRC Research. (2012). Buy it, try it, rate it. Available at, Accessed August 2, 2015.

  27. Kupfer, A. K., Pähler vor der Holte, N., Kübler, R. V., & Hennig-Thurau, T. (2018). The role of the partner brand’s social media power in brand alliances. Journal of Marketing, 82(3), 25–44.

  28. Lee, R. S. (2013). Vertical integration and exclusivity in platform and two-sided markets. American Economic Review, 103(7), 2960–3000.

  29. Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics, 30(1), 67–80.

  30. Liu, Y. (2006). Word-of-mouth for movies: Its dynamics and impact on box office revenue. Journal of Marketing, 70(3), 74–89.

  31. Maity, M., Dass, M., & Malhotra, N. K. (2014). The antecedents and moderators of offline information search: A meta-analysis. Journal of Retailing, 90(2), 233–254.

  32. Marchand, A., Hennig-Thurau, T., & Wiertz, C. (2017). Not all digital word of mouth is created equal: Understanding the respective impact of consumer reviews and microblogs on new product success. International Journal of Research in Marketing, 34(2), 336–354.

  33. Mayzlin, D., Dover, Y., & Chevalier, J. A. (2014). Promotional reviews: An empirical investigation of online review manipulation. American Economic Review, 104(8), 2421–2455.

  34. Moon, S., Bergey, P. K., & Iacobucci, D. (2010). Dynamic effects among movie ratings, movie revenues, and viewer satisfaction. Journal of Marketing, 74(1), 108–121.

  35. Moorthy, S., Ratchford, B. T., & Talukdar, D. (1997). Consumer information search revisited: Theory and empirical analysis. Journal of Consumer Research, 23(4), 263–277.

  36. Muchnik, L., Aral, S., & Taylor, S. J. (2013). Social influence bias: A randomized experiment. Science, 341, 647–651.

  37. Neelamegham, R., & Chintagunta, P. (1999). A Bayesian model to forecast new product performance in domestic and international markets. Marketing Science, 18(2), 115–136.

  38. PBS Newshour. (2015). Spotting the fakes among the five-star reviews. Available at, Accessed September 7, 2015.

  39. Prendergast, C., & Stole, L. (1996). Impetuous youngsters and jaded old-timers: Acquiring a reputation for learning. Journal of Political Economy, 104(6), 1105–1134.

  40. Radas, S., & Shugan, S. M. (1998). Seasonal marketing and timing new product introductions. Journal of Marketing Research, 35(3), 296–315.

  41. Rao, V. R., Abraham, S., Basuroy, S., Gretz, R., & Chen, J. (2017). The impact of advertising content on movie revenues. Marketing Letters, 28, 341–355.

  42. Ratchford, B. T., Talukdar, D., & Lee, M.-S. (2007). The impact of the internet on consumers’ use of information sources for automobiles: A re-inquiry. Journal of Consumer Research, 34(1), 111–119.

  43. Ravid, S. A. (1999). Information, blockbusters, and stars: A study of the film industry. Journal of Business, 72(4), 463–492.

  44. Ravid, S. A., Wald, J. K., & Basuroy, S. (2006). Distributors and critics: Does it take two to tango? Journal of Cultural Economics, 30(3), 201–218.

  45. Roodman, D. (2009). A note on the theme of too many instruments. Oxford Bulletin of Economics and Statistics, 71(1), 135–158.

  46. Rossi, P. E. (2014). Invited paper—Even the rich can make themselves poor: A critical examination of IV methods in marketing applications. Marketing Science, 33(5), 655–672.

  47. Sass, E. (2013). Most marketers will spend more on social media in 2014. Available at, Accessed September 4, 2015.

  48. Scharfstein, D. S., & Stein, J. C. (1990). Herd behavior and investment. American Economic Review, 80(3), 465–479.

  49. Urbany, J. E., Dickson, P. R., & Wilkie, W. L. (1989). Buyer uncertainty and information search. Journal of Consumer Research, 16(2), 208–215.

  50. Vogel, H. L. (2007). Entertainment industry economics (7th ed.). Cambridge: Cambridge University Press.

  51. You, Ya., Vadakkepatt, G. G., & Joshi, A. M. (2015). A meta-analysis of electronic word-of-mouth elasticity. Journal of Marketing, 79(2), 19–39.

  52. Zhang, X., & Dellarocas, C. (2006). The lord of the ratings: Is a movie’s fate influenced by reviews? In ICIS 2006 proceedings, 1959–1978.

  53. Zwiebel, J. (1995). Corporate conservatism and relative compensation. Journal of Political Economy, 103(1), 1–25.



Suman Basuroy thanks the Carl De Santis Center for Motion Picture Industry Studies for partially supporting this project with a grant. Avri Ravid thanks Rutgers Business School for a research grant partially supporting this work. We thank participants in the annual Business and Economics Scholars Workshop in Motion Picture Industry Studies as well as participants in the second behavioral economics workshop at Tel Aviv College for comments and suggestions. All errors remain our own.

Author information

Correspondence to Suman Basuroy.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix 1: Instrumental variables and first-stage estimations

Instrumental variables

Rossi (2014), in a recent critical overview of instrumental variable analysis, highlights the need for researchers to adequately identify the potential sources of omitted variable bias (i.e., endogeneity) and to discuss why the chosen instruments should be excluded from the estimation equation yet related to the endogenous variables. We address these issues below. We then discuss additional instruments, generated by leveraging heteroskedasticity in the first-stage estimations (Lewbel 2012), that further aid identification.

There are seven potentially endogenous variables in our analysis: \(UserVolume\), \(UserRating\), \(UserVariance\), \(CriticRating\), \(CriticVariance\), Screen, and Advs. We note that we are not concerned with correlation between the number of critic reviews (\(CriticVolume\)) and the error term because critics are typically assigned to review a film well in advance of the film’s release date (footnote 12)—it is highly unlikely that editors, when assigning critics to films, consider predictions of changes in a film’s unobservable characteristics weeks after its release.

While we control for time-invariant film characteristics with first differencing, we still need to control for possible time-variant endogeneity. For example, both critical and user reviews may be driven by the characteristics of the focal film relative to other films on the market. The focal film’s relative “quality” may change over the 10-week run as the set of other films on the market changes every week. Differential relative “quality” of the same movie may drive different average ratings from both critics and users as well as induce a higher volume of internet reviews, which consequently can generate higher box office revenues. This results in endogeneity concerns for the variances of critic and user ratings as well, since each is a function of its respective average. Similarly, different relative “quality” may affect studio distribution and advertising strategies. Chintagunta et al. (2010) and Gopinath et al. (2013) express similar concerns and identify viable instruments for three of these variables, namely \(UserVolume\), \(UserRating\), and Screen.

We follow Neelamegham and Chintagunta (1999) in identifying an instrument for Screen using the average number of screens that show movies of the same genre as the focal movie i in week t (\(CompScreen_{it}\)). However, we deviate from Chintagunta et al. (2010) in our choice of instruments for \(UserVolume\) and \(UserRating\) since our data are national rather than local.

Instruments need to be correlated with the suspected endogenous variable but not with the error term in Eq. (2), \(\Delta \varepsilon_{it}\) (Greene 2011a; Rossi 2014). One benefit of first differencing to account for time-invariant film-specific effects is that we can use lagged levels of our endogenous variables as instruments (Greene 2011a): “without group effects, there is a simple instrumental variables estimator available. Assuming that the time series is long enough, one could use the lagged differences … or the lagged levels … (p. 308)”. Lagged levels are appropriate as long as they are not correlated with \(\Delta \varepsilon_{it}\) and influence the independent variable (Greene 2011a; Rossi 2014). For \(UserVolume\) and \(UserRating\), we believe users are highly unlikely to consider forecasts of \(\Delta \varepsilon_{it}\) when (1) deciding to leave a review and (2) evaluating the film. Also, Moon et al. (2010) show that previous user ratings influence future user ratings. Additionally, in an article published in Science, Muchnik et al. (2013) show in a randomized experiment that social influence works on users, in that prior ratings by others significantly affect individual ratings. One known problem with using lagged levels as instruments is that they are typically weak, with little correlation with the first difference (Greene 2011a). However, as we show in our first-stage estimations below, this is not a concern in our dataset—the lagged levels of \(UserVolume\) and \(UserRating\) have significant explanatory power for their counterparts in the first-stage estimations.
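To make the mechanics concrete, the following sketch applies a lagged-level instrument to a first-differenced simulated panel. The data-generating process, variable names, and coefficient values here are ours, purely for illustration, and are not the paper's data or model:

```python
import numpy as np

rng = np.random.default_rng(0)
m, T, rho, beta = 200, 10, 0.5, 2.0

# Illustrative panel: x_it is AR(1); its innovation v_it also enters the
# error term, so x is endogenous and pooled OLS on differenced data is biased.
v = rng.normal(size=(m, T))
x = np.zeros((m, T))
x[:, 0] = v[:, 0]
for t in range(1, T):
    x[:, t] = rho * x[:, t - 1] + v[:, t]

mu = rng.normal(size=(m, 1))                     # film fixed effect mu_i
y = mu + beta * x + 0.5 * v + 0.3 * rng.normal(size=(m, T))

dy, dx = np.diff(y, axis=1), np.diff(x, axis=1)  # first differencing removes mu_i
z = x[:, :-2].ravel()                            # lagged LEVEL x_{i,t-2} as instrument
dy2, dx2 = dy[:, 1:].ravel(), dx[:, 1:].ravel()  # align differences with the instrument

b_ols = (dx2 @ dy2) / (dx2 @ dx2)                # biased by the endogenous innovation
b_iv = (z @ dy2) / (z @ dx2)                     # just-identified IV, close to beta
```

A two-period lag is used because, in this simulated setup, a one-period lag would share an innovation with the differenced error and thus be invalid; the level two periods back predates both shocks entering \(\Delta \varepsilon_{it}\) while remaining correlated with the change through mean reversion.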

We leverage the longitudinal aspect of our data set to obtain additional instruments which can help identify \(UserVolume\). For film i observed in week t after release, we find the average number of user reviews in the tth week after release for all movies released before film i from a different genre. For example, the \(UserVolume\) of a movie observed in its 3rd week will be instrumented by the average 3rd week \(UserVolume\) of all movies from a different genre released before the focal movie. A similar approach is used in Lee’s (2013) study of the video game industry when obtaining instruments for console and game prices. This measure will be uncorrelated with \(\Delta \varepsilon_{it}\) by construction since users are very unlikely to consider forecasts of future films’ \(\Delta \varepsilon\), especially future films in different genres, when deciding to leave their reviews.

For each film in each week t, we obtain the average \(UserVolume\) in the tth week of all previously released movies in a different genre (\(PrevUserVolumeDiffGen\)). We then use the change in this average from the \((t - 1)\)th week to the tth week (\(\Delta PrevUserVolumeDiffGen\)) as the instrument. We expect a positive correlation with the change in \(UserVolume\), as the instrument likely captures industry-wide changes in the number of reviews consumers typically leave for films from week to week. Another concern highlighted by Rossi (2014) is the use of instruments that may not vary among groups (e.g., price indices). It is important to note that this instrument varies by film and over time, since we very seldom find two films of the same genre released at the same time in our dataset.
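A toy construction of this instrument might look as follows; the genre labels, counts, and helper name are invented for illustration, not taken from the paper's data:

```python
import numpy as np

rng = np.random.default_rng(1)
n_movies, n_weeks = 6, 4
genres = np.array(["action", "comedy", "action", "drama", "comedy", "drama"])
release_order = np.arange(n_movies)   # movies already sorted by release date
# user_volume[i, t]: number of user reviews for movie i in its t-th week
user_volume = rng.integers(10, 100, size=(n_movies, n_weeks)).astype(float)

def prev_user_volume_diff_gen(i, t):
    """Average t-th-week user volume over earlier releases in another genre."""
    mask = (release_order < release_order[i]) & (genres != genres[i])
    return user_volume[mask, t].mean() if mask.any() else np.nan

inst = np.array([[prev_user_volume_diff_gen(i, t) for t in range(n_weeks)]
                 for i in range(n_movies)])
delta_inst = np.diff(inst, axis=1)    # week-to-week change used as the instrument
```

Note that the first release has no predecessors, so its instrument is undefined (NaN here); such observations would simply drop out of the estimation sample.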

\(UserVariance\) is likely to be identified in part by instruments for \(UserRating\) because the former is a nonlinear function of the latter. To address this nonlinear relationship, we include the natural log of the lagged level of \(UserRating\), \(Log\left( {1 + UserRating} \right)\) (footnote 13), as an additional instrument for \(UserRating\).

The valence of professional critic reviews, \(CriticRating\), may also be endogenous since critical reviews should be correlated with unobserved relative movie “quality.” We use an instrument for critics’ ratings designed to capture critic experience (footnote 14). More experienced critics can be in a different position vis-à-vis corporate headquarters. There is an entire literature in finance and economics suggesting that people with more experience may require higher incentives to act in accordance with shareholders’ values (Prendergast and Stole 1996; Scharfstein and Stein 1990; Zwiebel 1995). In line with this research, Ravid et al. (2006) argue that professional movie critics with a better reputation/more experience exhibit stronger corporate biases than others. As such, we expect average critics’ experience to be negatively correlated with the change in average critical rating.

Reviewers’ experience is almost by definition completely uncorrelated with the unobserved relative “quality” of the movie. It is difficult to obtain data on critics’ tenure and experience, but as a proxy we are able to obtain previous reviews on RT (footnote 15). We postulate that the greater the number of reviews, the greater the experience. For each critic, we find the number of reviews posted prior to reviewing the focal film. Then we find the level and the natural log of the average for all critics who review the focal film. Though we expect a positive impact of experience on the change in average critical rating, we include the natural log to address a possible nonlinear relationship. These values, \(AvgPrevRev_{it}\) and \(Log\left( {1 + AvgPrevRev_{it} } \right)\), change as new critical reviews become available throughout a movie’s run (footnote 16). Note that we do not use the lagged level of \(CriticRating\) as an instrument because it is unlikely that professional, well-trained critics coordinate their reviews taking into account previous critical assessments of a film, although this remains an empirical question (footnote 17).

As with \(UserVariance\) and \(UserRating\), we expect that \(CriticVariance\) is partly identified by instruments for \(CriticRating\). However, we generate additional internal instruments that leverage heteroskedasticity in the first-stage estimation of \(CriticVariance\) (Lewbel 2012) to help separately identify the impact of the variance of critic ratings on box office. We discuss this in greater depth below.

Another potentially endogenous variable is advertising. We follow Chintagunta et al. (2010) and note that (a) prerelease advertising accounts for the vast majority of advertising spending in the movie industry (e.g., Elberse and Anand 2007 find that 88% of television advertising spending occurred prior to initial release), (b) prerelease advertising budgets are typically a fixed proportion of the production budget (see Ravid 1999; Vogel 2007), and (c) any impact of prerelease advertising is captured by \(\mu_{i}\), since this value is unchanged over the run of the film—the estimations control for this through first differencing. However, changes in advertising expenditure over the course of a film’s run may be influenced by the unobserved relative “quality” of competing films. As an instrument, we leverage the panel structure of the data and use the level of advertising expenditure in the previous week. Since the endogenous variable in the estimation equation is the lagged first difference of advertising, the instrument is the 2-period lagged level of advertising. This is a valid instrument because it is unlikely that firms set advertising expenditures in a given week considering the change in forecasted future shocks, \(\Delta \varepsilon_{it}\), two or more periods out. In the movie industry, as is well established in the marketing literature, the key emphasis is on buzz creation and brand awareness (Houston et al. 2018). Advertising is less likely to have such long-term effects in this industry. For example, De Vries et al. (2017) find empirically that only one lag of advertising mattered in their VARX model exploring the influence of several variables (including traditional advertising) on their contemporaneous counterparts (footnote 18).

Additional instruments leveraging heteroskedasticity in the first-stage estimations

We note that the empirical results below are robust to including only the traditional instruments we outline above. However, in our empirical setting it is difficult to strongly identify all endogenous variables using traditional external instruments alone. Identification improves when we leverage the procedure outlined in Lewbel (2012) and augment the traditional instruments with internal instruments that exploit heteroskedasticity in the first-stage estimations (footnote 19).

Lewbel (2012) shows that if heteroskedasticity is present in the first-stage estimations (footnote 20), then additional instruments can be created by interacting the residuals of the respective first-stage estimation with any (or all) demeaned independent variables included in the first-stage estimation. The key idea is that while the exogenous regressors in the first-stage estimation are uncorrelated with the error term in the first-stage regression by construction, there is no reason to believe that the residuals will be independent of the regressors in the reduced-form estimation. If the residuals are heteroskedastic (i.e., dependent on the regressors), then this information can be used to further untangle the endogenous part of the offending variable from the exogenous part. In fact, more heteroskedasticity aids in identifying the endogenous regressor (Lewbel 2012).

In the extreme, Lewbel (2012) shows that models are identified using only heteroskedasticity in first-stage estimations without any additional instruments, though “[t]he resulting identification is based on higher moments and so likely to provide less reliable estimates than identification based on standard exclusion restrictions, but may be useful in applications where traditional instruments are not available or could be used along with traditional instruments to increase efficiency” (p. 67). We follow this advice and use the additional instruments along with the traditional instruments discussed above.

Additional instruments are created first by estimating the first-stage regression including all exogenous variables and instruments and obtaining residuals. Then the residuals are interacted with demeaned values of the relevant regressors from the first-stage estimation to create new variables. Any new variable created using this method is then included like a standard instrument in instrumental variable analysis.
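The steps above can be sketched on simulated data; the data-generating process and all names here are ours, chosen only to exhibit first-stage heteroskedasticity of the kind Lewbel (2012) requires:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
w = rng.chisquare(3, size=n)             # included exogenous regressor
# First-stage error variance depends on w: this is the heteroskedasticity
# the Lewbel (2012) procedure exploits (illustrative DGP, not the paper's).
x_endog = 1.0 + 0.8 * w + np.sqrt(w) * rng.normal(size=n)

W = np.column_stack([np.ones(n), w])     # preliminary first-stage design
coef, *_ = np.linalg.lstsq(W, x_endog, rcond=None)
resid = x_endog - W @ coef               # first-stage residuals

# Generated instrument: demeaned regressor interacted with the residuals.
# It is mean-zero by construction yet co-moves with the endogenous variable
# through the heteroskedastic part of the first stage.
z_lewbel = (w - w.mean()) * resid
```

In the paper's application the residuals would come from the full first-stage regression (all exogenous variables and external instruments included), and the generated variable is then added to the instrument set like any standard instrument.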

We do not include extra instruments generated from the residuals of all first-stage estimations interacted with all exogenous variables, to avoid the instrument proliferation that may weaken results (Roodman 2009). Rather, we focus on generating instruments for endogenous variables that may be difficult to identify using external instruments alone. Our main concern is identifying \(\Delta UserVolume\), \(\Delta UserVariance\), \(\Delta CriticRating\), \(\Delta CriticVariance\), and \(\Delta Screen\). We believe the impact of the variance of user opinion will be difficult to separately identify from \(UserRating\) and \(UserVolume\) (given that \(UserVariance\) is a function of both) with external instruments only. It may also be difficult to separately identify critic rating and the variance of critic rating for similar reasons; this is likely compounded by the fact that we have only two external instruments (\(AvgPrevRev_{it}\) and \(Log\left( {1 + AvgPrevRev_{it} } \right)\)) for both endogenous variables. Lastly, to aid identification, we augment the single instrument for \(\Delta Screen\), which, unlike the instrument for advertising, is not based on the lagged level of the variable.

For \(\Delta UserVariance\) we create an additional instrument by interacting the demeaned \(\Delta PrevUserVolumeDiffGen\) (\({\text{DM}}\left( {\Delta PrevUserVolumeDiffGen} \right)\), where \({\text{DM}}\left( \bullet \right)\) is the demeaning operator) with the residuals from the preliminary first-stage estimation of lagged first-differenced \(UserVariance\) (\(Res\Delta UserVariance_{it - 1}\)). Our expectation is that the size of the variation in the variance of user opinion changes as more users leave reviews—we expect \(\Delta PrevUserVolumeDiffGen\) will capture this since it is a relevant instrument for \(UserVolume\). We include the interaction of the first-stage residuals for \(\Delta UserVolume\) with its demeaned lagged level to help separately identify it from \(\Delta UserVariance\). For \(\Delta CriticRating\) and \(\Delta CriticVariance\), we create two additional instruments by interacting the residuals from both preliminary first-stage estimations with demeaned \(Log\left( {1 + AvgPrevRev_{it} } \right)\), because it is likely that critic experience influences both critic rating and the size of the variation in critic rating (footnote 21). Additionally, we create another Lewbel (2012)-style instrument for \(\Delta CriticRating\) using the lagged level of user rating. The key idea is that the level of disagreement among critics may be related to the popular appeal of the film. As the variance in critical opinion may also be related to the popularity of the film, we include another instrument for \(\Delta CriticVariance\) using the lagged level of user volume. Finally, we use the lagged level of user volume to create another instrument for \(\Delta Screen\). This is a relevant instrument if studios alter their distribution intensity in response to online chatter and if the variation in the size of the response varies with the amount of online chatter.

First-stage estimations

We list the descriptive statistics for the traditional external instruments as well as the Lewbel (2012) style instruments in Table 11. We show the first-stage estimations for our full model in Table 12. The first-stage results indicate the instruments are relevant (all first-stage F-statistics are well above 10) and the signs generally conform to priors: \(\Delta UserVolume\), \(\Delta UserRating\), and \(\Delta Log\left( {1 + Advs} \right)\) are significantly identified by using lagged levels; \(\Delta PrevUserVolumeDiffGen\) has a positive and significant impact on \(\Delta UserVolume\); critical experience captured in \(AvgPrevRev_{it}\) and \(Log\left( {1 + AvgPrevRev_{it} } \right)\) is significantly related to \(\Delta CriticRating\)Footnote 22 and \(\Delta CriticVariance\); \(\Delta CompScreen\) significantly identifies ∆Screen.

Table 11 Descriptive statistics for instrumental variables
Table 12 First-stage regressions. The dependent variable in each estimation is listed across the first row

Finally, the additional instruments created by leveraging heteroskedasticity in the preliminary first-stage estimations aid in identification. They significantly impact their respective endogenous variables, and they are all significant in other first-stage estimations. Note that we do not make predictions about the signs of these additional instruments, since we have no theoretical guidance on the impact of higher moments (Lewbel 2012). However, the significance of these variables suggests they are related to heteroskedasticity in the first-stage estimations, which is what is required for identification.
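The relevance check cited above (first-stage F-statistics well above the rule-of-thumb threshold of 10) can be sketched as follows. All variable names and the simulated data are hypothetical; this is a minimal illustration of the joint F-test on excluded instruments, not the paper's estimation code.

```python
import numpy as np


def first_stage_f(endog, instruments, exog=None):
    """F-statistic for joint significance of the excluded instruments
    in a first-stage regression. Rule of thumb: F > 10 suggests the
    instruments are relevant (not weak)."""
    n = endog.shape[0]
    ones = np.ones((n, 1))
    # Restricted model: constant (plus any included exogenous regressors).
    X_r = ones if exog is None else np.column_stack([ones, exog])
    # Unrestricted model: add the excluded instruments.
    X_u = np.column_stack([X_r, instruments])

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, endog, rcond=None)
        resid = endog - X @ beta
        return resid @ resid

    rss_r, rss_u = rss(X_r), rss(X_u)
    q = instruments.shape[1]  # number of excluded instruments
    return ((rss_r - rss_u) / q) / (rss_u / (n - X_u.shape[1]))


# Hypothetical example: one strong instrument for an endogenous regressor.
rng = np.random.default_rng(1)
z = rng.normal(size=(1000, 1))
x = 0.8 * z[:, 0] + rng.normal(size=1000)
print(first_stage_f(x, z) > 10)  # prints True
```

In practice one would also report underidentification and overidentification diagnostics alongside the first-stage F-statistics, as is standard in GMM instrumental variables estimation.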

Appendix 2: Robustness check of main model using alternative review data

We obtain data on critical evaluation and consumer evaluation for each movie included in the analysis from alternative review platforms, collected in a similar fashion as the RT data. We use these data to calculate the variance in critic and user valence within a week rather than the variance in average evaluation from week to week, something we are not able to do with the RT data. We estimate the main models in the paper (Table 7, Models 3 and 4) using these data, along with the more precise measures of variance. Descriptive statistics for these variables and for the instruments used in this robustness check are in Table 13 (note that we are able to identify the endogenous variables in instrumental variable estimation with fewer Lewbel style instruments); estimation results are in Table 14. The results are qualitatively similar to the findings in the text based on the RT data.

Table 13 Descriptive statistics for the alternative review data
Table 14 OLS and instrumental variables robustness check regression results using GMM estimations

Cite this article

Basuroy, S., Abraham Ravid, S., Gretz, R.T. et al. Is everybody an expert? An investigation into the impact of professional versus user reviews on movie revenues. J Cult Econ (2019).


Keywords

  • eWOM volume
  • eWOM valence
  • Expert reviews
  • Movies