Abstract
Because consumer reviews leverage the wisdom of the crowd, the way in which they are aggregated is a central decision faced by platforms. We explore this “rating aggregation problem” and offer a structural approach to solving it, allowing for (1) reviewers to vary in stringency and accuracy, (2) reviewers to be influenced by existing reviews, and (3) product quality to change over time. Applying this to restaurant reviews from Yelp.com, we construct an adjusted average rating and show that even a simple algorithm can lead to large information efficiency gains relative to the arithmetic average.
Notes
Luca (2011) shows that Yelp consumers respond directly to the average rating even though it is coarser than the underlying information. The importance of the average rating is also supported by an online survey we conducted for this study. In this survey, we ask subjects to report their general use and understanding of restaurant ratings, without mentioning Yelp. Out of the 239 respondents, 93.7% use the average rating to choose a restaurant, but a much lower percentage of respondents said that they pay attention to other review information such as the number of reviews, rating trends, or reviewer profile. More details of the survey are presented in Section 6.
Yelp had 167 million unique visitors per month in the first quarter of 2016 (source: http://www.yelp.com/factsheet), but according to a 2011 blog post of Yelp, journalist Susan Kuchinskas estimated that “only 1 percent of users will actively create content. Another 9 percent, the editors, will participate by commenting, rating, or sharing the content. The other 90 percent watch, look, and read without responding.” (https://www.yelpblog.com/2011/06/yelp-and-the-1990-rule, accessed on June 5, 2016).
We assume that the horizontal preferences of reviewers affect reviewer stringency. Hence, when we present the vertical quality to general readers of Yelp reviews, we should benchmark the adjusted average to the stringency of one type of reviewer. We choose to benchmark it to a reviewer with average attributes. The construction of reviewer horizontal preference is presented in Section 2.1.
The “Elite” status is a badge displayed next to the reviewer name, and is awarded by Yelp to prolific reviewers who write high-quality reviews.
Based on hotel reservation data from Travelocity.com, which include consumer-generated reviews from Travelocity.com and TripAdvisor.com, Ghose et al. (2012) estimate consumer demand for various product attributes and then rank products according to estimated “expected utility gain.”
Readers interested in consumer usage of Yelp reviews can refer to Luca (2011), who combines the same Yelp data as in this paper with restaurant revenue data from Seattle. More generally, there is strong evidence that consumer reviews are an important source of information in a variety of settings. Chevalier and Mayzlin (2006) find that consumer ratings have predictive power for book sales. Both Godes and Mayzlin (2004) and Duan et al. (2008) find that the spread of word-of-mouth affects sales by raising consumer awareness; the former measure the spread by “the dispersion of conversations across communities” and the latter by the volume of reviews. Duan et al. (2008) argue that after accounting for the endogenous correlation among ratings, online user reviews have no significant impact on movies’ box office revenues.
We assume that a reviewer submits one review for a restaurant. Therefore, the order of the review indicates the reviewer’s identity. On Yelp.com, reviewers are only allowed to display one review per restaurant.
Some reviewers are by nature generous and obtain psychological gains from submitting reviews that are more favorable than what they actually feel. In this case, \(\theta_{rn}>0\) represents leniency.
The correlation structure is detailed in Appendix B.3.
Note that the martingale assumption entails two features in the stochastic process: first, conditional on \(\mu _{rt_{n-1}}\), \(\mu _{rt_{n}}\) is independent of the past signals \(\{s_{rt_{1}},...,s_{rt_{n-1}}\}\); second, conditional on \(\mu _{rt_{n}}\), \(s_{rt_{n}}\) is independent of the past signals \(\{s_{rt_{1}},...,s_{rt_{n-1}}\}\). These two features greatly facilitate reviewer n’s Bayesian estimate of restaurant quality. This is also why we choose martingale over other statistical processes (such as AR(1)).
The cuisine indicators describe whether a restaurant is traditional American, new American, European, Mediterranean, Latin American, Asian, Japanese, seafood, fast food, lounge, bar, bakery/coffee, vegetarian, or others. They are not mutually exclusive. The five price categories are (1,2,3,4) as defined by Yelp plus a missing price category (which we code as 0).
By construction, the sample mean of each factor is normalized to 0 and sample variance normalized to 1.
If reviewer i has not reviewed any restaurant yet, we set her taste equal to the mean characteristics of restaurants (\(C_{it}=0\)).
We have tried to estimate a model that allows \(\rho_{i}\) to vary by reviewer attributes other than elite status, but none of the other attributes significantly affects \(\rho\) based on the likelihood ratio test.
Our model of time trend in addition to year fixed effects is consistent with Godes and Silva (2012), who suggest that the negative temporal trend may be due to the fact that reviewers are becoming more critical and more negative in general, and they find that after conditioning on the year a review was written, ratings increase over time. In our case, the time trend since the first review of a restaurant remains negative after we control for year fixed effects.
Note that this decline is in addition to the random walk evolution of restaurant quality because the martingale deviation is assumed to have a mean of zero.
We define the raw age by calendar days since a restaurant’s first review on Yelp and normalize the age variable in our estimation by (raw age-548)/10. We choose to normalize age relative to the 548th day because the downward trend of reviews is steeper in a restaurant’s early reviews and flattens at roughly 1.5 years after the first review.
A summary of the statistical data generating process is available in Appendix B.
The parameters to be estimated are \(\{\mu _{r0}\}_{r = 1}^{R}\), \(\sigma_{\xi}\), \((\sigma_{e},\sigma_{ne})\), \((\rho_{e},\rho_{ne})\), \((\alpha_{year_{t}}, \alpha_{numrev}, \alpha_{freqrev}, \alpha_{matchd}, \alpha_{tastevar}, \lambda_{(e-ne)0}, \beta_{age1}, \beta_{age2}, \beta_{numrev}, \beta_{freqrev}, \beta_{matchd}, \beta_{tastevar})\), and \((\alpha_{age1},\alpha_{age2})\). In an extended model, we also allow \(\{\sigma_{e},\sigma_{ne},\alpha_{age1},\alpha_{age2},\sigma_{\xi}\}\) to differ for ethnic and non-ethnic restaurants.
This is relative to the review submitted 1.5 years after the first review, because age is normalized by (raw age - 548)/10.
The estimation of \(E(\mu _{rt_{n}}|s_{rt_{1}},s_{rt_{2}},...,s_{rt_{n}})\) is detailed in Appendix B.2.
In Hu et al. (2009), ratings are found to follow bimodal distributions on Amazon (with many one and five stars) and the paper attributed this to the tendency to review when opinions are extreme. We do not find the bimodal distribution pattern on Yelp that Hu et al. (2009) provide as evidence of significant reviewer selection.
Reviews identified by Yelp as fake reviews are removed from the Yelp pages. We do not observe these reviews and do not consider them in our analysis.
We can potentially predict the elite status using a reviewer's past activities on Yelp, but since we do not observe how many ratings the sample reviewers have left outside Seattle, we cannot reliably predict elite status.
Note that the martingale evolution of restaurant quality implies an increasing variance around the restaurant’s fixed effect, while positive social incentives imply a decreasing variance.
Specifically, we have \(\exp(AIC_{Bayesian}-AIC_{LimitedAttention})=\exp((\log L_{Bayesian}-\log L_{LimitedAttention})/2)=46,630\).
We create these figures by simulating a large number of ratings according to the underlying model, and then computing adjusted versus simple average of ratings at each time of review.
In the simulation with full model specifications, the assumption about whether restaurant age affects restaurant quality or reviewer bias is nonessential for comparing the mean absolute errors of the two aggregating methods. The adjusted average always corrects for reviewer bias, whereas the simple average always reflects the sum of the changes in quality and reviewer bias.
There is a large literature on social image and social influence, with most evidence demonstrated in lab or field experiments. For example, Ariely et al. (2009) show that social image is important for charity giving and private monetary incentives partially crowd out the image motivation.
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
Akerlof, G.A. (1980). A theory of social custom, of which unemployment may be one consequence. Quarterly Journal of Economics, 94(4), 749–75.
Alevy, J.E., Haigh, M.S., & List, J.A. (2007). Information cascades: evidence from a field experiment with financial market professionals. The Journal of Finance, 62(1), 151–180.
Ariely, D., Bracha, A., & Meier, S. (2009). Doing good or doing well? Image motivation and monetary incentives in behaving prosocially. American Economic Review, 99(1), 544–555.
Banerjee, A.V. (1992). A simple model of herd behavior. The Quarterly Journal of Economics, 107(3), 797–817.
Bénabou, R., & Tirole, J. (2006). Incentives and prosocial behavior. American Economic Review, 96(5), 1652–1678.
Bikhchandani, S., Hirshleifer, D., & Welch, I. (1992). A Theory of Fads, Fashion, Custom and Cultural Change as Informational Cascades. Journal of Political Economy, 100(5), 992–1026.
Brown, J., Hossain, T., & Morgan, J. (2010). Shrouded attributes and information suppression: evidence from the field. Quarterly Journal of Economics, 125(2), 859–876.
Chen, Y., Maxwell Harper, F., Konstan, J., & Li, S.X. (2010). Social comparison and contributions to online communities: a field experiment on MovieLens. American Economic Review, 100(4), 1358–1398.
Chevalier, J.A., & Mayzlin, D. (2006). The effect of word of mouth on sales: online book reviews. Journal of Marketing Research, 43(3), 345–354.
Duan, W., Gu, B., & Whinston, A.B. (2008). Do online reviews matter? An empirical investigation of panel data. Decision Support Systems, 45(4), 1007–1016.
Eyster, E., & Rabin, M. (2010). Naïve Herding in Rich-Information Settings. American Economic Journal: Microeconomics, 2(4), 221–243.
Fradkin, A., Grewal, E., & Holtz, D. (2017). The Determinants of Online Review Informativeness: Evidence from Field Experiments on Airbnb. Working paper.
Ghose, A., Ipeirotis, P., & Li, B. (2012). Designing Ranking Systems for Hotels on Travel Search Engines by Mining User-Generated and Crowd-Sourced Content. Marketing Science.
Glazer, J., McGuire, T.G., Cao, Z., & Zaslavsky, A. (2008). Using global ratings of health plans to improve the quality of health care. Journal of Health Economics, 27(5), 1182–95.
Godes, D., & Mayzlin, D. (2004). Using Online Conversations to Study Word-of-Mouth Communication. Marketing Science, 23(4), 545–560.
Godes, D., & Silva, J.C. (2012). Sequential and temporal dynamics of online opinion. Marketing Science, 31(3), 448–473.
Hu, N., Zhang, J., & Pavlou, P. (2009). Overcoming the J-shaped distribution of product reviews. Communications of the ACM.
Li, X., & Hitt, L. (2008). Self-selection and information role of online product reviews. Information Systems Research, 19(4), 456–474.
Ljungqvist, L., & Sargent, T.J. (2012). Recursive macroeconomic theory, 3rd edition. Cambridge: MIT Press.
Luca, M. (2011). Reviews, Reputation, and Revenue: The Case of Yelp.com. Harvard Business School working paper.
Luca, M., & Smith, J. (2013). Salience in Quality Disclosure: Evidence from The US News College Rankings. Journal of Economics & Management Strategy.
Luca, M., & Zervas, G. (2016). Fake it till you make it: reputation, competition, and Yelp review fraud. Management Science.
Mayzlin, D., Dover, Y., & Chevalier, J.A. (2014). Promotional Reviews: an Empirical Investigation of Online Review Manipulation. American Economic Review.
Miller, N., Resnick, P., & Zeckhauser, R.J. (2005). Eliciting informative feedback: the peer-prediction method. Management Science, 51(9), 1359–1373.
Moe, W.W., & Trusov, M. (2011). The value of social dynamics in online product ratings forums. Journal of Marketing Research, 48(3), 444–456.
Moe, W.W., & Schweidel, D.A. (2012). Online product opinions: incidence, evaluation, and evolution. Marketing Science, 31(3), 372–386.
Muchnik, L., Aral, S., & Taylor, S.J. (2013). Social influence bias: a randomized experiment. Science, 341(6146), 647–651.
Nosko, C., & Tadelis, S. (2015). The Limits of reputation in platform Markets: an empirical analysis and field experiment. NBER Working Paper No. 20930, January 2015.
Pope, D. (2009). Reacting to rankings: evidence from America’s best hospitals. Journal of Health Economics, 28(6), 1154–1165.
Wang, Q., Goh, K.Y., & Lu, X. (2012). How does user generated content influence consumers’ new product exploration and choice diversity? An empirical analysis of product reviews and consumer variety seeking behaviors. Working paper.
Wang, Z. (2010). Anonymity, Social Image, and the Competition for Volunteers: A Case Study of the Online Market for Reviews. The B.E. Journal of Economic Analysis & Policy.
Welch, G., & Bishop, G. (2001). An introduction to the Kalman filter. In Proceedings of the Siggraph Course, Los Angeles.
Wu, C., Che, H., Chan, T.Y., & Lu, X. (2015). The Economic Value of Online Reviews. Marketing Science, 34(5), 739–754.
Appendices
Appendices for aggregation of consumer ratings: an application to Yelp.com
Appendix A: Model of reviewer incentives to deviate from prior reviews
In this appendix, we show an alternative model to capture reviewer incentive to differentiate from prior ratings corresponding to our baseline model in Section 2.1. It gives rise to exactly the same equation except for \(\rho_{i}<0\).
If social incentives motivate reviewer i to deviate from prior reviews, we can model it as reviewer i choosing to report \(x_{rt_{n}}\) to minimize a slightly different objective:
where \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}})\) is the posterior belief of true quality given all the prior ratings (not counting i’s own signal) and \(w_{i}>0\) is the marginal utility that reviewer i gets from reporting differently from prior ratings. By Bayes’ Rule, \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})\) is a weighted average of \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}})\) and i’s own signal \(s_{rt_{n}}\), which we can write as \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})=\alpha \cdot s_{rt_{n}}+(1-\alpha )\cdot E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}})\). Combining this with the first order condition of \(F_{rn}^{(2)}\), we have
$$x_{rt_{n}}=\lambda_{rn}+\rho_{i}E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})+(1-\rho_{i})s_{rt_{n}} $$
if we redefine \(\lambda _{rn}=\frac {1}{1-w_{i}}\theta _{rn}\) and \(\rho _{i}=-\frac {w_{i}}{(1-w_{i})(1-\alpha )}\). Note that the optimal ratings in the above two scenarios are written in exactly the same expression except that \(\rho_{i}>0\) if one tries to be close to the best guess of the true restaurant quality in her report and \(\rho_{i}<0\) if one is motivated to deviate from prior ratings. The empirical estimate of \(\rho_{i}\) will inform us which scenario is more consistent with the data. In short, the weight \(\rho_{i}\) is an indicator of how a rating correlates with past ratings. As long as later ratings contain information from past ratings, aggregation needs to weigh early and late reviews differently.
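The displayed objective and first order condition are missing from this copy of the appendix. As a hedged sketch only, the following quadratic objective is one form that reproduces the stated redefinitions of \(\lambda_{rn}\) and \(\rho_{i}\); the functional form, the shorthand \(E_{n-1}\) and \(E_{n}\), and the restriction \(w_{i}<1\) (needed for the second order condition) are assumptions introduced here, not taken from the text.

```latex
% Assumed objective (illustrative): stay close to one's own signal net of stringency,
% while gaining w_i from distance to the belief implied by prior ratings.
\begin{align*}
E_{n-1} &\equiv E(\mu_{rt_n}|x_{rt_1},\dots,x_{rt_{n-1}}), \qquad
E_{n} \equiv E(\mu_{rt_n}|x_{rt_1},\dots,x_{rt_{n-1}},s_{rt_n}),\\
F_{rn}^{(2)} &= (x_{rt_n}-s_{rt_n}-\theta_{rn})^2 - w_i\,(x_{rt_n}-E_{n-1})^2, \qquad 0<w_i<1,\\
\text{FOC:}\quad 0 &= (x_{rt_n}-s_{rt_n}-\theta_{rn}) - w_i\,(x_{rt_n}-E_{n-1})
\;\Rightarrow\;
x_{rt_n}=\frac{\theta_{rn}+s_{rt_n}-w_i E_{n-1}}{1-w_i},\\
\text{Bayes:}\quad E_{n} &= \alpha s_{rt_n}+(1-\alpha)E_{n-1}
\;\Rightarrow\;
E_{n-1}=\frac{E_{n}-\alpha s_{rt_n}}{1-\alpha},\\
x_{rt_n} &= \underbrace{\frac{\theta_{rn}}{1-w_i}}_{\lambda_{rn}}
+\underbrace{\Big(-\frac{w_i}{(1-w_i)(1-\alpha)}\Big)}_{\rho_i}E_{n}
+\underbrace{\frac{1-\alpha+w_i\alpha}{(1-w_i)(1-\alpha)}}_{1-\rho_i}\,s_{rt_n}.
\end{align*}
```

Under these assumptions the coefficient on the signal equals \(1-\rho_{i}\) exactly, so the rating has the same linear form as in the baseline model with \(\rho_{i}<0\).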
Appendix B: Notes on the data generating process
B.1 Data generating process
The model presented in Section 2.1 includes random change in restaurant quality, random noise in reviewer signal, reviewer heterogeneity in stringency, social incentives, and signal precision, and a quadratic time trend, as well as the quality of the match between the reviewer and the restaurant. Overall, one can consider the data generation process as the following steps:
1. Restaurant r starts with an initial quality \(\mu_{r0}\) when it is first reviewed on Yelp. Denote this time as time 0. Since time 0, restaurant quality \(\mu_{r}\) evolves in a random walk process by calendar time, where an i.i.d. quality noise \(\xi _{t}\sim N(0,\sigma _{\xi }^{2})\) is added on to restaurant quality at t so that \(\mu_{rt}=\mu_{r(t-1)}+\xi_{t}\).
2. A reviewer arrives at restaurant r at time \(t_{n}\) as r’s nth reviewer. She observes the attributes and ratings of all the previous \(n-1\) reviewers of r. She also obtains a signal \(s_{rt_{n}}=\mu _{rt_{n}}+\epsilon _{rn}\) of the concurrent restaurant quality where the signal noise \(\epsilon _{rn}\sim N\left (0,\sigma _{\epsilon }^{2}\right )\).
3. The reviewer chooses an optimal rating that gives weights to both her own experience and her social incentives. The optimal rating takes the form
$$x_{rt_{n}}=\lambda_{rn}+\rho_{n}E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})+(1-\rho_{n})s_{rt_{n}} $$
where \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})\) is the best guess of the restaurant quality at \(t_{n}\) by Bayesian updating.
4. The reviewer is assumed to know the attributes of all past reviewers so that she can de-bias the stringency of past reviewers. The reviewer also knows that the general population of reviewers may change taste from year to year (captured in year fixed effects \(\{\alpha_{year_{t}}\}\)), and there is a quadratic trend in \(\lambda\) by restaurant age (captured in \(\{\alpha_{age1},\alpha_{age2}\}\)). This trend could be driven by changes in reviewer stringency or restaurant quality and these two drivers are not distinguishable in the above expression for \(x_{rt_{n}}\).
In the Bayesian estimate of \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})\), we assume the nth reviewer of r is fully rational and has perfect information about the other reviewers’ observable attributes, which according to our model determine the other reviewers’ stringency (\(\lambda\)), social preference (\(\rho\)), and signal noise (\(\sigma_{\epsilon}\)). With this knowledge, the nth reviewer of r can back out the signal of each reviewer before her; thus the Bayesian estimate of \(E(\mu _{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})\) can be rewritten as \(E(\mu _{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})\). Typical Bayesian inference implies that a reviewer’s posterior about restaurant quality is a weighted average of previous signals and her own signal, with the weight increasing with signal precision. This is complicated by the fact that restaurant quality evolves by a martingale process, and therefore current restaurant quality is better reflected in recent reviews. Accordingly, the Bayesian estimate of \(E(\mu _{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})\) should give more weight to more recent reviews even if all reviewers have the same stringency, social preference, and signal precision. The analytical derivation of \(E(\mu _{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})\) is presented in Appendix B.2.
B.2 Deriving \(E(\mu _{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})\)
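For restaurant r, denote the prior belief of \(\mu _{rt_{n}}\) right before the realization of the nth signal as
$$\pi_{n|n-1}(\mu_{rt_{n}})=f(\mu_{rt_{n}}|s_{rt_{1}},...,s_{rt_{n-1}}) $$
and we assume that the first reviewer uses an uninformative prior.
Denote the posterior belief of \(\mu _{rt_{n}}\) after observing \(s_{rt_{n}}\) as
$$h_{n|n}(\mu_{rt_{n}})=f(\mu_{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}}). $$
Hence
$$h_{n|n}(\mu_{rt_{n}})=\frac{f(s_{rt_{n}}|\mu_{rt_{n}},s_{rt_{1}},...,s_{rt_{n-1}})\,\pi_{n|n-1}(\mu_{rt_{n}})}{\int f(s_{rt_{n}}|\mu_{rt_{n}},s_{rt_{1}},...,s_{rt_{n-1}})\,\pi_{n|n-1}(\mu_{rt_{n}})\,d\mu_{rt_{n}}}=\frac{f(s_{rt_{n}}|\mu_{rt_{n}})\,\pi_{n|n-1}(\mu_{rt_{n}})}{\int f(s_{rt_{n}}|\mu_{rt_{n}})\,\pi_{n|n-1}(\mu_{rt_{n}})\,d\mu_{rt_{n}}} $$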
where \(f(s_{rt_{n}}|\mu _{rt_{n}},s_{rt_{1}},...s_{rt_{n-1}})=f(s_{rt_{n}}|\mu _{rt_{n}})\) comes from the assumption that \(s_{rt_{n}}\) is independent of past signals conditional on \(\mu _{rt_{n}}\).
In the above formula, the prior belief of \(\mu _{rt_{n}}\) given the realization of \(\{s_{rt_{1}},...,s_{rt_{n-1}}\}\), or \(\pi _{n|n-1}(\mu _{rt_{n}})\), depends on the posterior belief of \(\mu _{rt_{n-1}}\), \(h_{n-1|n-1}(\mu _{rt_{n-1}})\), and the evolution process from \(\mu _{rt_{n-1}}\) to \(\mu _{rt_{n}}\), denoted as \(g(\mu_{rt_{n}}|\mu_{rt_{n-1}})\). Hence,
$$\pi_{n|n-1}(\mu_{rt_{n}})=\int g(\mu_{rt_{n}}|\mu_{rt_{n-1}})\,h_{n-1|n-1}(\mu_{rt_{n-1}})\,d\mu_{rt_{n-1}}. $$
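Given the normality of \(\pi_{n|n-1}\), \(f(s_{rt_{n}}|\mu _{rt_{n}})\) and \(g(\mu_{rt_{n}}|\mu_{rt_{n-1}})\), \(h_{n|n}(\mu _{rt_{n}})\) is distributed normal. In addition, denote \(\mu_{n|n-1}\) and \(\sigma _{n|n-1}^{2}\) as the mean and variance of the random variable with normal probability density function \(\pi _{n|n-1}(\mu _{rt_{n}})\), and \(\mu_{n|n}\) and \(\sigma _{n|n}^{2}\) as the mean and variance of the random variable with normal pdf \(h_{n|n}(\mu _{rt_{n}})\). After combining terms in the derivation of \(\pi _{n|n-1}(\mu _{rt_{n}})\) and \(h_{n|n}(\mu _{rt_{n}})\), the mean and variance evolve according to the following rule, where \(\sigma_{rn}^{2}\) is the signal noise variance of the nth reviewer:
$$\begin{array}{rcl} \mu_{n|n} &=& \mu_{n|n-1}+\frac{\sigma_{n|n-1}^{2}}{\sigma_{n|n-1}^{2}+\sigma_{rn}^{2}}\left(s_{rt_{n}}-\mu_{n|n-1}\right),\\ \sigma_{n|n}^{2} &=& \frac{\sigma_{n|n-1}^{2}\sigma_{rn}^{2}}{\sigma_{n|n-1}^{2}+\sigma_{rn}^{2}},\\ \mu_{n+1|n} &=& \mu_{n|n},\\ \sigma_{n+1|n}^{2} &=& \sigma_{n|n}^{2}+(t_{n+1}-t_{n})\sigma_{\xi}^{2}. \end{array} $$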
Hence, we can deduce the beliefs from the initial prior,
\(E(\mu _{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})=\mu _{n|n}\) is derived recursively following the above formulation.
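The recursion described in this subsection can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: it uses the standard Gaussian updating implied by a random-walk state with normal signal noise, treats the uninformative prior as a diffuse one (so the first posterior equals the first signal), and the function and argument names (`posterior_beliefs`, `noise_vars`, `sigma_xi2`) are hypothetical.

```python
def posterior_beliefs(signals, times, noise_vars, sigma_xi2):
    """Return a list of (posterior mean, posterior variance) after each signal.

    signals    : s_{rt_1}, ..., s_{rt_n}
    times      : calendar times t_1 <= ... <= t_n of the reviews
    noise_vars : signal noise variance of each reviewer
    sigma_xi2  : per-period variance of the random-walk quality shock
    """
    beliefs = []
    mean = var = prev_t = None
    for s, t, r in zip(signals, times, noise_vars):
        if prev_t is None:
            # Diffuse (uninformative) prior: the first posterior is the signal itself.
            mean, var = s, r
        else:
            # Prediction step: the mean is unchanged (martingale), while the
            # variance grows with the calendar time elapsed since the last review.
            prior_var = var + (t - prev_t) * sigma_xi2
            # Update step: weight on the new signal rises with prior uncertainty.
            gain = prior_var / (prior_var + r)
            mean = mean + gain * (s - mean)
            var = (1.0 - gain) * prior_var
        prev_t = t
        beliefs.append((mean, var))
    return beliefs
```

With `sigma_xi2 = 0` and equal noise variances the posterior mean collapses to the simple average of the signals; any positive `sigma_xi2` shifts weight toward the more recent signals, which is the property the text emphasizes.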
B.3 The correlation of ratings induced by quality change
We assume quality evolution follows a martingale process: \(\mu_{rt}=\mu_{r(t-1)}+\xi_{t}\), where t denotes the units of calendar time since restaurant r was first reviewed and the t-specific evolution \(\xi_{t}\) conforms to \(\xi _{t}\sim i.i.d.\ N\left (0,\sigma _{\xi }^{2}\right )\). This martingale process introduces a positive correlation of restaurant quality over time,
which increases with the timing of the earlier date (t) but is independent of the time between t and t ′.
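The displayed expression is missing from this copy. As a sketch under the stated random walk, and conditioning on the initial quality \(\mu_{r0}\) (an assumption made here so that the moments are well defined), the quantity with the described property can be derived directly:

```latex
\begin{align*}
\mu_{rt} &= \mu_{r0}+\sum_{\tau=1}^{t}\xi_{\tau},\\
Cov(\mu_{rt},\mu_{rt'}\mid\mu_{r0})
 &= Cov\Big(\sum_{\tau=1}^{t}\xi_{\tau},\ \sum_{\tau=1}^{t'}\xi_{\tau}\Big)
 = t\,\sigma_{\xi}^{2}\qquad\text{for } t<t',\\
Corr(\mu_{rt},\mu_{rt'}\mid\mu_{r0})
 &= \frac{t\,\sigma_{\xi}^{2}}{\sqrt{t\,\sigma_{\xi}^{2}\cdot t'\sigma_{\xi}^{2}}}
 = \sqrt{t/t'}.
\end{align*}
```

In this sketch it is the covariance \(t\,\sigma_{\xi}^{2}\) that grows with the earlier date and does not depend on the gap \(t'-t\); the correlation coefficient \(\sqrt{t/t'}\) does depend on the later date.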
Recall that \(x_{rt_{n}}\) is the nth review written at time \(t_{n}\) since r was first reviewed. We can express the nth reviewer’s signal as:
$$s_{rt_{n}}=\mu_{rt_{n}}+\epsilon_{rn}. $$
Signal noise \(\epsilon_{rn}\) is assumed to be i.i.d. with \(Var(s_{rt_{n}}|\mu _{rt_{n}})={\sigma _{i}^{2}}\) where i is the identity of the nth reviewer. The variance of restaurant quality at \(t_{n}\) conditional on quality at \(t_{n-1}\) is
$$Var(\mu_{rt_{n}}|\mu_{rt_{n-1}})=(t_{n}-t_{n-1})\sigma_{\xi}^{2}. $$
Note that the martingale assumption entails two features in the stochastic process: first, conditional on \(\mu _{rt_{n-1}}\), \(\mu _{rt_{n}}\) is independent of the past signals \(\{s_{rt_{1}},...,s_{rt_{n-1}}\}\); second, conditional on \(\mu _{rt_{n}}\), \(s_{rt_{n}}\) is independent of the past signals \(\{s_{rt_{1}},...,s_{rt_{n-1}}\}\). These two features greatly facilitate reviewer n’s Bayesian estimate of restaurant quality.
Appendix C: Deriving the likelihood function
C.1 Deriving the likelihood function \(f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})\)
Because the covariance structure of \(\{x_{rt_{2}}-x_{rt_{1}},x_{rt_{3}}-x_{rt_{2}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}}\}\) is complicated, we use the change of variable technique to express the likelihood \(f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})\) by \(f(s_{rt_{2}}-s_{rt_{1}},...,s_{rt_{N_{r}}}-s_{rt_{N_{r}-1}})\),
$$f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})=|J_{\Delta x\rightarrow{\Delta} s}|\cdot f(s_{rt_{2}}-s_{rt_{1}},...,s_{rt_{N_{r}}}-s_{rt_{N_{r}-1}}). $$
The derivation of \(f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})\) proceeds as follows:
Step 1: To derive \(f(s_{rt_{2}}-s_{rt_{1}},...,s_{rt_{N_{r}}}-s_{rt_{N_{r}-1}})\), we note that \(s_{rt_{n}}=\mu _{rt_{n}}+\epsilon _{rn}\) and thus, for any \(m>n\), \(n\geq 2\), the variance and covariance structure can be written as:
$$\begin{array}{@{}rcl@{}} &&Cov(s_{rt_{n}}-s_{rt_{n-1}},s_{rt_{m}}-s_{rt_{m-1}})\\ &&= Cov(\epsilon_{rn}-\epsilon_{rn-1}+\xi_{t_{n-1}+ 1}+...+\xi_{t_{n}},\epsilon_{rm}-\epsilon_{rm-1}+\xi_{t_{m-1}+ 1}+...+\xi_{t_{m}})\\ &&= \left\{\begin{array}{ll} -\sigma_{rn}^{2} & if\ m=n + 1\\ 0 & if\ m>n + 1 \end{array}\right.\\ && Var(s_{rt_{n}}-s_{rt_{n-1}})\\ &&= \sigma_{rn}^{2}+\sigma_{rn-1}^{2}+(t_{n}-t_{n-1})\sigma_{\xi}^{2}. \end{array} $$
Denoting the total number of reviewers on restaurant r as \(N_{r}\), the vector of the first differences of signals as \({\Delta }s_{r}=\{s_{rt_{n}}-s_{rt_{n-1}}\}_{n = 2}^{N_{r}}\), and its variance-covariance matrix as \({\Sigma }_{\Delta s_{r}}\), we have
$$f({\Delta} s_{r})=(2\pi)^{-\frac{N_{r}-1}{2}}|{\Sigma}_{\Delta s_{r}}|^{-1/2}exp\left( -\frac{1}{2}{\Delta} s_{r}^{\prime}{\Sigma}_{\Delta s_{r}}^{-1}{\Delta} s_{r}\right). $$
Step 2: We derive the value of \(\{s_{rt_{1}},...,s_{rt_{N_{r}}}\}_{r = 1}^{R}\) from observed ratings \(\{x_{rt_{1}},...,x_{rt_{N_{r}}}\}_{r = 1}^{R}\). Given
$$x_{rt_{n}}=\lambda_{rn}+\rho_{n}E(\mu_{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})+(1-\rho_{n})s_{rt_{n}} $$
and \(E(\mu _{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})\) as a function of \(\{s_{rt_{1}},...,s_{rt_{n}}\}\) (formula in Appendix B.2), we can solve \(\{s_{rt_{1}},...,s_{rt_{n}}\}\) from \(\{x_{rt_{1}},...,x_{rt_{n}}\}\) according to the recursive formula in Appendix C.2.
Step 3: We derive \(|J_{\Delta s\rightarrow{\Delta} x}|^{-1}\) or \(|J_{\Delta x\rightarrow{\Delta} s}|\), where \(J_{\Delta x\rightarrow{\Delta} s}\) is such that
$$\left[\begin{array}{c} s_{rt_{2}}-s_{rt_{1}}\\ ...\\ s_{rt_{n}}-s_{rt_{n-1}} \end{array}\right]=J_{\Delta x\rightarrow{\Delta} s}\left[\begin{array}{c} x_{rt_{2}}-x_{rt_{1}}\\ ...\\ x_{rt_{n}}-x_{rt_{n-1}} \end{array}\right] $$
The analytical form of \(J_{\Delta x\rightarrow{\Delta} s}\) is available given the recursive expression for \(x_{rt_{n}}\) and \(s_{rt_{n}}\).
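Step 1 can be illustrated with a short Python sketch. Because differenced signals are correlated only with their immediate neighbours, the covariance matrix is tridiagonal, and the log density can be evaluated without a general matrix inverse. The sketch assumes a zero-mean multivariate normal density with the usual \(|\Sigma|^{-1/2}\) normalization; the function and argument names are hypothetical, and the tridiagonal \(LDL'\) factorization is a convenience chosen here, not something stated in the text.

```python
import math

def delta_signal_loglik(delta_s, times, noise_vars, sigma_xi2):
    """Log density of first-differenced signals for one restaurant.

    delta_s    : [s_2 - s_1, ..., s_N - s_{N-1}]
    times      : calendar times of the N reviews
    noise_vars : signal noise variance of each of the N reviewers
    sigma_xi2  : per-period variance of the quality shock
    """
    n = len(times)
    assert len(noise_vars) == n and len(delta_s) == n - 1
    # Diagonal: both reviewers' noise plus accumulated quality shocks.
    diag = [noise_vars[k + 1] + noise_vars[k] + (times[k + 1] - times[k]) * sigma_xi2
            for k in range(n - 1)]
    # Off-diagonal: adjacent differences share one reviewer, with opposite signs.
    off = [-noise_vars[k + 1] for k in range(n - 2)]

    log_det, quad = 0.0, 0.0
    d_prev = z_prev = None
    for k in range(n - 1):
        if k == 0:
            d, z = diag[0], delta_s[0]
        else:
            l = off[k - 1] / d_prev            # sub-diagonal entry of L
            d = diag[k] - l * off[k - 1]       # k-th pivot of D
            z = delta_s[k] - l * z_prev        # forward substitution L z = delta_s
        log_det += math.log(d)
        quad += z * z / d
        d_prev, z_prev = d, z
    m = n - 1
    return -0.5 * (m * math.log(2.0 * math.pi) + log_det + quad)
```

The full log likelihood of the differenced ratings would add the log of the Jacobian determinant from Step 3 to this value; that term is not computed here.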
C.2 Solving \(\{s_{rt_{1}},...,s_{rt_{n}}\}\) from observed ratings
Solve \(\{s_{rt_{1}},...s_{rt_{n}}\}\) from \(\{x_{rt_{1}},...x_{rt_{n}}\}\) according to the following recursive formula:
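The recursive formula itself is missing from this copy. A minimal sketch of how such a recursion could work follows, assuming the rating equation \(x_{rt_{n}}=\lambda_{rn}+\rho_{n}E(\mu_{rt_{n}}|s_{rt_{1}},...,s_{rt_{n}})+(1-\rho_{n})s_{rt_{n}}\), Gaussian updating with a diffuse initial prior, and known \(\lambda\), \(\rho\), and noise variances. Since the posterior mean is linear in the current signal, each rating can be inverted for its signal one review at a time. All names in the code are hypothetical.

```python
def recover_signals(ratings, times, noise_vars, lambdas, rhos, sigma_xi2):
    """Back out signals s_1..s_n from ratings x_1..x_n, one review at a time."""
    signals = []
    mean = var = prev_t = None
    for x, t, r, lam, rho in zip(ratings, times, noise_vars, lambdas, rhos):
        if prev_t is None:
            # Diffuse prior: the posterior mean equals the signal (gain of one).
            prior_mean, prior_var, gain = 0.0, None, 1.0
        else:
            prior_mean = mean
            prior_var = var + (t - prev_t) * sigma_xi2
            gain = prior_var / (prior_var + r)
        # Posterior mean = (1 - gain) * prior_mean + gain * s, so
        # x = lam + rho * (1 - gain) * prior_mean + (1 - rho + rho * gain) * s.
        s = (x - lam - rho * (1.0 - gain) * prior_mean) / (1.0 - rho + rho * gain)
        # Carry the belief forward for the next reviewer.
        mean = prior_mean + gain * (s - prior_mean)
        var = r if prior_var is None else (1.0 - gain) * prior_var
        prev_t = t
        signals.append(s)
    return signals
```

The division is well defined whenever \(\rho_{n}\leq 1\), because the denominator \(1-\rho_{n}(1-\text{gain})\) is then at least as large as the gain, which is strictly positive.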
Appendix D: Tables
Appendix E: Figures
Appendix F: “Restaurant reviews beliefs survey” questionnaire
To test our model against an external source of information, we conducted an online survey using Amazon MTurk (“Restaurant Reviews Beliefs Survey,” February 1, 2016) in which we asked how respondents used and comprehended restaurant ratings online (we did not mention Yelp in the survey). In total, 239 MTurk workers responded to our survey. The following shows a screenshot of the questionnaire.
Dai, W., Jin, G., Lee, J. et al. Aggregation of consumer ratings: an application to Yelp.com. Quant Mark Econ 16, 289–339 (2018). https://doi.org/10.1007/s11129-017-9194-9