Aggregation of consumer ratings: an application to Yelp.com

Dai, Weijia (Daisy); Jin, Ginger; Lee, Jungmin; Luca, Michael

doi:10.1007/s11129-017-9194-9

Published: 29 December 2017

Quantitative Marketing and Economics, Volume 16, pages 289–339 (2018)

Weijia (Daisy) Dai¹,
Ginger Jin² (ORCID: orcid.org/0000-0001-7912-3780),
Jungmin Lee³ &
Michael Luca⁴

Abstract

Because consumer reviews leverage the wisdom of the crowd, the way in which they are aggregated is a central decision faced by platforms. We explore this “rating aggregation problem” and offer a structural approach to solving it, allowing for (1) reviewers to vary in stringency and accuracy, (2) reviewers to be influenced by existing reviews, and (3) product quality to change over time. Applying this to restaurant reviews from Yelp.com, we construct an adjusted average rating and show that even a simple algorithm can lead to large information efficiency gains relative to the arithmetic average.

Notes

Luca (2011) shows that Yelp consumers respond directly to the average rating even though it is coarser than the underlying information. The importance of the average rating is also supported by an online survey we conducted for this study. In this survey, we ask subjects to report their general use and understanding of restaurant ratings, without mentioning Yelp. Out of the 239 respondents, 93.7% use the average rating to choose a restaurant, but a much lower percentage of respondents said that they pay attention to other review information such as the number of reviews, rating trends, or reviewer profile. More details of the survey are presented in Section 6.
Yelp had 167 million unique visitors per month in the first quarter of 2016 (source: http://www.yelp.com/factsheet), but a 2011 Yelp blog post cites journalist Susan Kuchinskas’s estimate that “only 1 percent of users will actively create content. Another 9 percent, the editors, will participate by commenting, rating, or sharing the content. The other 90 percent watch, look, and read without responding.” (https://www.yelpblog.com/2011/06/yelp-and-the-1990-rule, accessed on June 5, 2016).
We assume that the horizontal preferences of reviewers affect reviewer stringency. Hence, when we present the vertical quality to general readers of Yelp reviews, we should benchmark the adjusted average to the stringency of one type of reviewer. We choose to benchmark it to a reviewer with average attributes. The construction of reviewer horizontal preference is presented in Section 2.1.
The “Elite” status is a badge displayed next to the reviewer name, and is awarded by Yelp to prolific reviewers who write high-quality reviews.
Based on hotel reservation data from Travelocity.com, which include consumer-generated reviews from Travelocity.com and TripAdvisor.com, Ghose et al. (2012) estimate consumer demand for various product attributes and then rank products according to estimated “expected utility gain.”
Readers interested in consumer usage of Yelp reviews can refer to Luca (2011), who combines the same Yelp data as in this paper with restaurant revenue data from Seattle. More generally, there is strong evidence that consumer reviews are an important source of information in a variety of settings. Chevalier and Mayzlin (2006) find that consumer ratings have predictive power for book sales. Both Godes and Mayzlin (2004) and Duan et al. (2008) find that the spread of word-of-mouth affects sales by raising consumer awareness; the former measure the spread by “the dispersion of conversations across communities” and the latter by the volume of reviews. Duan et al. (2008) argue that after accounting for the endogenous correlation among ratings, online user reviews have no significant impact on movies’ box office revenues.
We assume that a reviewer submits one review for a restaurant. Therefore, the order of the review indicates the reviewer’s identity. On Yelp.com, reviewers are only allowed to display one review per restaurant.
Some reviewers are by nature generous and obtain psychological gains from submitting reviews that are more favorable than what they actually feel. In this case, $\theta_{rn} > 0$ represents leniency.
The correlation structure is detailed in Appendix A.
Note that the martingale assumption entails two features in the stochastic process: first, conditional on $\mu _{rt_{n-1}}$, $\mu _{rt_{n}}$ is independent of the past signals $\{s_{rt_{1}},...,s_{rt_{n-1}}\}$; second, conditional on $\mu _{rt_{n}}$, $s_{rt_{n}}$ is independent of the past signals $\{s_{rt_{1}},...,s_{rt_{n-1}}\}$. These two features greatly facilitate reviewer n’s Bayesian estimate of restaurant quality. This is also why we choose martingale over other statistical processes (such as AR(1)).
The cuisine indicators describe whether a restaurant is traditional American, new American, European, Mediterranean, Latin American, Asian, Japanese, seafood, fast food, lounge, bar, bakery/coffee, vegetarian, or others. They are not mutually exclusive. The five price categories are (1,2,3,4) as defined by Yelp plus a missing price category (which we code as 0).
By construction, the sample mean of each factor is normalized to 0 and sample variance normalized to 1.
If reviewer i has not reviewed any restaurant yet, we set her taste equal to the mean characteristics of restaurants ($C_{it} = 0$).
We have tried to estimate a model that allows $\rho_{i}$ to vary by reviewer attributes other than elite status, but none of the other attributes significantly affects $\rho$ based on the likelihood ratio test.
Our model of time trend in addition to year fixed effects is consistent with Godes and Silva (2012), who suggest that the negative temporal trend may be due to the fact that reviewers are becoming more critical and more negative in general, and they find that after conditioning on the year a review was written, ratings increase over time. In our case, the time trend since the first review of a restaurant remains negative after we control for year fixed effects.
Note that this decline is in addition to the random walk evolution of restaurant quality because the martingale deviation is assumed to have a mean of zero.
We define the raw age by calendar days since a restaurant’s first review on Yelp and normalize the age variable in our estimation by (raw age-548)/10. We choose to normalize age relative to the 548th day because the downward trend of reviews is steeper in a restaurant’s early reviews and flattens at roughly 1.5 years after the first review.
A summary of the statistical data generating process is available in Appendix A.
The parameters to be estimated are $\{\mu_{r0}\}_{r = 1}^{R}$, $\sigma_{\xi}$, $(\sigma_{e},\sigma_{ne})$, $(\rho_{e},\rho_{ne})$, $(\alpha_{year_{t}}, \alpha_{numrev}, \alpha_{freqrev}, \alpha_{matchd}, \alpha_{tastevar}, \lambda_{(e-ne)0}, \beta_{age1}, \beta_{age2}, \beta_{numrev}, \beta_{freqrev}, \beta_{matchd}, \beta_{tastevar})$, and $(\alpha_{age1},\alpha_{age2})$. In an extended model, we also allow $\{\sigma_{e},\sigma_{ne}, \alpha_{age1}, \alpha_{age2}, \sigma_{\xi}\}$ to differ for ethnic and non-ethnic restaurants.
This is relative to the review submitted 1.5 years after the first review, because age is normalized by (raw age - 548)/10.
The estimation of $E(\mu _{rt_{n}}|s_{rt_{1}},s_{rt_{2}},..,s_{rt_{n}})$ is detailed in Appendix A.
In Hu et al. (2009), ratings are found to follow bimodal distributions on Amazon (with many one and five stars) and the paper attributed this to the tendency to review when opinions are extreme. We do not find the bimodal distribution pattern on Yelp that Hu et al. (2009) provide as evidence of significant reviewer selection.
Reviews identified by Yelp as fake reviews are removed from the Yelp pages. We do not observe these reviews and do not consider them in our analysis.
We could potentially predict elite status using a reviewer’s past activities on Yelp, but since we do not observe how many ratings the sample reviewers have left outside Seattle, we cannot reliably predict elite status.
Note that the martingale evolution of restaurant quality implies an increasing variance around the restaurant’s fixed effect, while positive social incentives imply a decreasing variance.
Specifically, we have $\exp((AIC_{Bayesian}-AIC_{LimitedAttention})) = \exp((\log L_{Bayesian}-\log L_{LimitedAttention})/2) = 46{,}630$.
We create these figures by simulating a large number of ratings according to the underlying model, and then computing adjusted versus simple average of ratings at each time of review.
In the simulation with full model specifications, the assumption about whether restaurant age affects restaurant quality or reviewer bias is nonessential for comparing the mean absolute errors of the two aggregating methods. The adjusted average always corrects any reviewer bias, and the simple average always reflects the sum of the changes in quality and reviewer bias.
There is a large literature on social image and social influence, with most evidence demonstrated in lab or field experiments. For example, Ariely et al. (2009) show that social image is important for charity giving and private monetary incentives partially crowd out the image motivation.

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
Akerlof, G.A. (1980). A theory of social custom, of which unemployment may be one consequence. Quarterly Journal of Economics, 94(4), 749–75.
Alevy, J.E., Haigh, M.S., & List, J.A. (2007). Information cascades: evidence from a field experiment with financial market professionals. The Journal of Finance, 62(1), 151–180.
Ariely, D., Bracha, A., & Meier, S. (2009). Doing good or doing well? Image motivation and monetary incentives in behaving prosocially. American Economic Review, 99(1), 544–555.
Banerjee, A.V. (1992). A simple model of herd behavior. The Quarterly Journal of Economics, 107(3), 797–817.
Bénabou, R., & Tirole, J. (2006). Incentives and prosocial behavior. American Economic Review, 96(5), 1652–1678.
Bikhchandani, S., Hirshleifer, D., & Welch, I. (1992). A Theory of Fads, Fashion, Custom and Cultural Change as Informational Cascades. Journal of Political Economy, 100(5), 992–1026.
Brown, J., Hossain, T., & Morgan, J. (2010). Shrouded attributes and information suppression: evidence from the field. Quarterly Journal of Economics, 125(2), 859–876.
Chen, Y., Maxwell Harper, F., Konstan, J., & Li, S.X. (2010). Social comparison and contributions to online communities: a field experiment on MovieLens. American Economic Review, 100(4), 1358–1398.
Chevalier, J.A., & Mayzlin, D. (2006). The effect of word of mouth on sales: online book reviews. Journal of Marketing Research, 43(3), 345–354.
Duan, W., Gu, B., & Whinston, A.B. (2008). Do online reviews matter? An empirical investigation of panel data. Decision Support Systems, 45(4), 1007–1016.
Eyster, E., & Rabin, M. (2010). Naïve Herding in Rich-Information Settings. American Economic Journal: Microeconomics, 2(4), 221–243.
Fradkin, A., Grewal, E., & Holtz, D. (2017). The Determinants of Online Review Informativeness: Evidence from Field Experiments on Airbnb. Working paper.
Ghose, A., Ipeirotis, P., & Li, B. (2012). Designing Ranking Systems for Hotels on Travel Search Engines by Mining User-Generated and Crowd-Sourced Content. Marketing Science.
Glazer, J., McGuire, T.G., Cao, Z., & Zaslavsky, A. (2008). Using global ratings of health plans to improve the quality of health care. Journal of Health Economics, 27(5), 1182–95.
Godes, D., & Mayzlin, D. (2004). Using Online Conversations to Study Word-of-Mouth Communication. Marketing Science, 23(4), 545–560.
Godes, D., & Silva, J.C. (2012). Sequential and temporal dynamics of online opinion. Marketing Science, 31(3), 448–473.
Hu, N., Zhang, J., & Pavlou, P. (2009). Overcoming the J-shaped distribution of product reviews. Communications of the ACM.
Li, X., & Hitt, L. (2008). Self-selection and information role of online product reviews. Information Systems Research, 19(4), 456–474.
Ljungqvist, L., & Sargent, T.J. (2012). Recursive macroeconomic theory, 3rd edition. Cambridge: MIT Press.
Luca, M. (2011). Reviews, Reputation, and Revenue: The Case of Yelp.com. Harvard Business School working paper.
Luca, M., & Smith, J. (2013). Salience in Quality Disclosure: Evidence from The US News College Rankings. Journal of Economics & Management Strategy.
Luca, M., & Zervas, G. (2016). Fake it till you make it: reputation, competition, and Yelp review fraud. Management Science.
Mayzlin, D., Dover, Y., & Chevalier, J.A. (2014). Promotional Reviews: an Empirical Investigation of Online Review Manipulation. American Economic Review.
Miller, N., Resnick, P., & Zeckhauser, R.J. (2005). Eliciting informative feedback: the peer-prediction method. Management Science, 51(9), 1359–1373.
Moe, W.W., & Trusov, M. (2011). The value of social dynamics in online product ratings forums. Journal of Marketing Research, 48(3), 444–456.
Moe, W.W., & Schweidel, D.A. (2012). Online product opinions: incidence, evaluation, and evolution. Marketing Science, 31(3), 372–386.
Muchnik, L., Aral, S., & Taylor, S.J. (2013). Social influence bias: a randomized experiment. Science, 341(6146), 647–651.
Nosko, C., & Tadelis, S. (2015). The Limits of reputation in platform Markets: an empirical analysis and field experiment. NBER Working Paper No. 20930, January 2015.
Pope, D. (2009). Reacting to rankings: evidence from America’s best hospitals. Journal of Health Economics, 28(6), 1154–1165.
Wang, Q., Goh, K.Y., & Lu, X. (2012). How does user generated content influence consumers’ new product exploration and choice diversity? An empirical analysis of product reviews and consumer variety seeking behaviors. Working paper.
Wang, Z. (2010). Anonymity, Social Image, and the Competition for Volunteers: A Case Study of the Online Market for Reviews. The B.E. Journal of Economic Analysis & Policy.
Welch, G., & Bishop, G. (2001). An introduction to the Kalman filter. In Proceedings of the Siggraph Course, Los Angeles.
Wu, C., Che, H., Chan, T.Y., & Lu, X. (2015). The Economic Value of Online Reviews. Marketing Science, 34(5), 739–754.

Author information

Authors and Affiliations

Lehigh University, Bethlehem, PA, 18015, USA
Weijia (Daisy) Dai
University of Maryland & NBER, College Park, MD, 20742, USA
Ginger Jin
Seoul National University, Seoul, South Korea
Jungmin Lee
Harvard Business School, Boston, MA, 02163, USA
Michael Luca

Corresponding author

Correspondence to Ginger Jin.

Appendices

Appendices for aggregation of consumer ratings: an application to Yelp.com

Appendix A: Model of reviewer incentives to deviate from prior reviews

In this appendix, we present an alternative model that captures a reviewer’s incentive to differentiate from prior ratings, corresponding to our baseline model in Section 2.1. It gives rise to exactly the same equation except that $\rho_{i} < 0$.

If social incentives motivate reviewer i to deviate from prior reviews, we can model it as reviewer i choosing to report $x_{rt_{n}}$ to minimize a slightly different objective:

$$F_{rn}^{(2)}=(x_{rt_{n}}-(s_{rt_{n}}+\theta_{rn}))^{2}-w_{i}[x_{rt_{n}}-E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...x_{rt_{n-1}})]^{2} $$

where $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...x_{rt_{n-1}})$ is the posterior belief of true quality given all the prior ratings (not counting i’s own signal) and $w_{i} > 0$ is the marginal utility that reviewer i gets from reporting differently from prior ratings. By Bayes’ Rule, $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...x_{rt_{n-1}},s_{rt_{n}})$ is a weighted average of $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...x_{rt_{n-1}})$ and i’s own signal $s_{rt_{n}}$, which we can write as $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...x_{rt_{n-1}},s_{rt_{n}})=\alpha \cdot s_{rt_{n}}+(1-\alpha )\cdot E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...x_{rt_{n-1}})$. Combining this with the first order condition of $F_{rn}^{(2)}$, we have

$$\begin{array}{@{}rcl@{}} x_{rt_{n}}^{(2)} &=& \frac{1}{(1-w_{i})} \theta_{rn}+\frac{1-\alpha+w_{i}\alpha}{(1-w_{i})(1-\alpha)}s_{rt_{n}}\\ &&-\frac{w_{i}}{(1-w_{i})(1-\alpha)}E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},..x_{rt_{n-1}},s_{rt_{n}})\\ & = & \lambda_{rn}+(1-\rho_{i})s_{rt_{n}}+\rho_{i}E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},..x_{rt_{n-1}},s_{rt_{n}}) \end{array} $$

if we redefine $\lambda_{rn}=\frac{1}{1-w_{i}}\theta_{rn}$ and $\rho_{i}=-\frac{w_{i}}{(1-w_{i})(1-\alpha)}$. Note that the optimal ratings in the two scenarios have exactly the same expression, except that $\rho_{i} > 0$ if one tries to be close to the best guess of the true restaurant quality in her report and $\rho_{i} < 0$ (provided $w_{i} < 1$) if one is motivated to deviate from prior ratings. The empirical estimate of $\rho_{i}$ will inform us which scenario is more consistent with the data. In short, the weight $\rho_{i}$ is an indicator of how a rating correlates with past ratings. As long as later ratings contain information from past ratings, aggregation needs to weigh early and late reviews differently.
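The algebra above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and the parameter values are hypothetical, and it assumes $0 < w_{i} < 1$ and $0 < \alpha < 1$ so that the objective is convex and the first order condition identifies a minimum. It computes the rating once directly from the first order condition and once from the $\lambda_{rn}$, $\rho_{i}$ form.

```python
def rating_from_foc(theta, s, prior_mean, w):
    """Minimizer of (x - (s + theta))^2 - w * (x - prior_mean)^2.

    Setting the derivative to zero gives (1 - w) * x = s + theta - w * prior_mean.
    Assumes 0 < w < 1 (illustrative assumption) so the objective is convex.
    """
    return (s + theta - w * prior_mean) / (1.0 - w)


def rating_from_rho_form(theta, s, prior_mean, w, alpha):
    """Same rating written as lambda + (1 - rho) * s + rho * posterior_mean."""
    lam = theta / (1.0 - w)
    rho = -w / ((1.0 - w) * (1.0 - alpha))
    # Posterior mean is a weighted average of own signal and the prior belief.
    posterior_mean = alpha * s + (1.0 - alpha) * prior_mean
    return lam + (1.0 - rho) * s + rho * posterior_mean, rho


# Hypothetical values chosen only for illustration.
theta, s, prior_mean, w, alpha = 0.2, 3.8, 3.1, 0.25, 0.4
x_foc = rating_from_foc(theta, s, prior_mean, w)
x_rho, rho = rating_from_rho_form(theta, s, prior_mean, w, alpha)
print(x_foc, x_rho, rho)
```

Under these assumptions the two expressions agree up to floating-point error and the implied $\rho_{i}$ is negative, which is the sign the alternative model predicts.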

Appendix B: Notes on the data generating process

B.1 Data generating process

The model presented in Section 2.1 includes random change in restaurant quality, random noise in reviewer signal, reviewer heterogeneity in stringency, social incentives, and signal precision, and a quadratic time trend, as well as the quality of the match between the reviewer and the restaurant. Overall, one can consider the data generating process as the following steps:

1. Restaurant r starts with an initial quality $\mu_{r0}$ when it is first reviewed on Yelp. Denote this time as time 0. Since time 0, restaurant quality $\mu_{rt}$ evolves as a random walk in calendar time, where an i.i.d. quality noise $\xi_{t}\sim N(0,\sigma_{\xi}^{2})$ is added to restaurant quality at t so that $\mu_{rt}=\mu_{r(t-1)}+\xi_{t}$.
2. A reviewer arrives at restaurant r at time $t_{n}$ as r’s $n^{th}$ reviewer. She observes the attributes and ratings of all the previous n − 1 reviewers of r. She also obtains a signal $s_{rt_{n}}=\mu_{rt_{n}}+\epsilon_{rn}$ of the concurrent restaurant quality, where the signal noise $\epsilon_{rn}\sim N\left(0,\sigma_{\epsilon}^{2}\right)$.
3. The reviewer chooses an optimal rating that gives weights to both her own experience and her social incentives. The optimal rating takes the form
$$x_{rt_{n}}=\lambda_{rn}+\rho_{n}E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})+(1-\rho_{n})s_{rt_{n}} $$
where $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})$ is the best guess of the restaurant quality at $t_{n}$ by Bayesian updating.
4. The reviewer is assumed to know the attributes of all past reviewers so that she can de-bias the stringency of past reviewers. The reviewer also knows that the general population of reviewers may change taste from year to year (captured in year fixed effects $\{\alpha_{year_{t}}\}$), and there is a quadratic trend in $\lambda$ by restaurant age (captured in $\{\alpha_{age1},\alpha_{age2}\}$). This trend could be driven by changes in reviewer stringency or restaurant quality, and these two drivers are not distinguishable in the above expression for $x_{rt_{n}}$.
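The steps above can be sketched as a small simulation. This is an illustrative sketch under simplifying assumptions that are not in the paper: all reviewers share the same stringency `lam`, social weight `rho`, and signal noise `sigma_eps`, the first review arrives at time 0, and the function name and arguments are hypothetical. The Bayesian best guess is computed with the diffuse-prior recursion derived in B.2.

```python
import random


def simulate_restaurant(mu0, review_times, sigma_xi, sigma_eps, rho, lam, seed=0):
    """Simulate qualities, signals, and ratings for one restaurant.

    review_times are calendar times of reviews, starting at 0 and increasing.
    Homogeneous reviewers (same lam, rho, sigma_eps) are an illustrative assumption.
    """
    rng = random.Random(seed)
    mu, t_prev = mu0, 0
    post_mean, post_var = None, None  # belief after the previous review
    qualities, signals, ratings = [], [], []
    for t in review_times:
        # Step 1: quality follows a random walk; (t - t_prev) i.i.d. shocks add up.
        mu += rng.gauss(0.0, sigma_xi * (t - t_prev) ** 0.5)
        # Step 2: the reviewer draws a noisy signal of current quality.
        s = mu + rng.gauss(0.0, sigma_eps)
        # Bayesian update of the belief about current quality.
        if post_mean is None:
            # Diffuse prior: the first posterior equals the first signal.
            post_mean, post_var = s, sigma_eps ** 2
        else:
            prior_var = post_var + (t - t_prev) * sigma_xi ** 2
            gain = prior_var / (prior_var + sigma_eps ** 2)
            post_mean = post_mean + gain * (s - post_mean)
            post_var = (1.0 - gain) * prior_var
        # Step 3: the rating mixes the Bayesian best guess and the own signal.
        x = lam + rho * post_mean + (1.0 - rho) * s
        qualities.append(mu)
        signals.append(s)
        ratings.append(x)
        t_prev = t
    return qualities, signals, ratings


# Hypothetical parameter values for illustration only.
q, s, x = simulate_restaurant(3.5, [0, 10, 40, 45], 0.05, 0.8, 0.3, -0.1, seed=1)
print(x)
```

Setting `rho` to zero makes each rating equal to the reviewer's own signal plus `lam`, which is a quick way to see what the social-incentive term adds.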

In the Bayesian estimate of $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})$, we assume the $n^{th}$ reviewer of r is fully rational and has perfect information about the other reviewers’ observable attributes, which according to our model determine the other reviewers’ stringency ($\lambda$), social preference ($\rho$), and signal noise ($\sigma_{\epsilon}$). With this knowledge, the $n^{th}$ reviewer of r can back out the signal of each reviewer before her; thus the Bayesian estimate of $E(\mu_{rt_{n}}|x_{rt_{1}},x_{rt_{2}},...,x_{rt_{n-1}},s_{rt_{n}})$ can be rewritten as $E(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})$. Typical Bayesian inference implies that a reviewer’s posterior about restaurant quality is a weighted average of previous signals and her own signal, with the weight increasing with signal precision. This is complicated by the fact that restaurant quality evolves by a martingale process, and therefore current restaurant quality is better reflected in recent reviews. Accordingly, the Bayesian estimate of $E(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})$ should give more weight to more recent reviews even if all reviewers have the same stringency, social preference, and signal precision. The analytical derivation of $E(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})$ is presented in Section B.2.

B.2 Deriving $E(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})$

For restaurant r, denote the prior belief of $\mu_{rt_{n}}$ right before the realization of the $n^{th}$ signal as

$$\pi_{n|n-1}(\mu_{rt_{n}})=f(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n-1}}) $$

and we assume that the first reviewer uses an uninformative prior

$$\mu_{1|0}= 0,\quad \sigma_{1|0}^{2}=W,\quad W\ \text{arbitrarily large} $$

Denote the posterior belief of $\mu _{rt_{n}}$ after observing $s_{rt_{n}}$ as

$$h_{n|n}(\mu_{rt_{n}})=f(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}}) $$

Hence

$$\begin{array}{@{}rcl@{}} h_{n|n}(\mu_{rt_{n}})\,=\,f(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}}) & = & \frac{f(\mu_{rt_{n}},s_{rt_{1}},...s_{rt_{n}})}{f(s_{rt_{1}},...s_{rt_{n}})}\\ & \propto & f(\mu_{rt_{n}},s_{rt_{1}},...s_{rt_{n}})\\ & = & f(s_{rt_{n}}|\mu_{rt_{n}},s_{rt_{1}},...s_{rt_{n-1}})f(\mu_{rt_{n}},s_{rt_{1}},...s_{rt_{n-1}})\\ & = & f(s_{rt_{n}}|\mu_{rt_{n}},s_{rt_{1}},...s_{rt_{n-1}})f(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n-1}})f(s_{rt_{1}},...s_{rt_{n-1}})\\ & \propto & f(s_{rt_{n}}|\mu_{rt_{n}})f(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n-1}})\\ & = & f(s_{rt_{n}}|\mu_{rt_{n}})\pi_{n|n-1}(\mu_{rt_{n}}) \end{array} $$

where $f(s_{rt_{n}}|\mu _{rt_{n}},s_{rt_{1}},...s_{rt_{n-1}})=f(s_{rt_{n}}|\mu _{rt_{n}})$ comes from the assumption that $s_{rt_{n}}$ is independent of past signals conditional on $\mu _{rt_{n}}$.

In the above formula, the prior belief of $\mu_{rt_{n}}$ given the realization of $\{s_{rt_{1}},...,s_{rt_{n-1}}\}$, or $\pi_{n|n-1}(\mu_{rt_{n}})$, depends on the posterior belief of $\mu_{rt_{n-1}}$, $h_{n-1|n-1}(\mu_{rt_{n-1}})$, and the evolution process from $\mu_{rt_{n-1}}$ to $\mu_{rt_{n}}$, denoted as $g(\mu_{n}|\mu_{n-1})$. Hence,

$$\pi_{n|n-1}(\mu_{rt_{n}})=g(\mu_{n}|\mu_{n-1})f(\mu_{rt_{n-1}}|s_{rt_{1}},...s_{rt_{n-1}})=g(\mu_{n}|\mu_{n-1})h_{n-1|n-1}(\mu_{rt_{n-1}}) $$

Given the normality of $\pi_{n|n-1}$, $f(s_{rt_{n}}|\mu_{rt_{n}})$ and $g(\mu_{n}|\mu_{n-1})$, $h_{n|n}(\mu_{rt_{n}})$ is normally distributed. In addition, denote $\mu_{n|n-1}$ and $\sigma_{n|n-1}^{2}$ as the mean and variance of the random variable with normal probability density function $\pi_{n|n-1}(\mu_{rt_{n}})$, and $\mu_{n|n}$ and $\sigma_{n|n}^{2}$ as the mean and variance of the random variable with normal pdf $h_{n|n}(\mu_{rt_{n}})$. After combining terms in the derivation of $\pi_{n|n-1}(\mu_{rt_{n}})$ and $h_{n|n}(\mu_{rt_{n}})$, the mean and variance evolve according to the following rule, where $s_{n}$ abbreviates $s_{rt_{n}}$ and $\sigma_{n}^{2}$ is the noise variance of the $n^{th}$ signal:

$$\begin{array}{@{}rcl@{}} \mu_{n|n} &=& \mu_{n|n-1}+\frac{\sigma_{n|n-1}^{2}}{\sigma_{n|n-1}^{2}+{\sigma_{n}^{2}}}(s_{n}-\mu_{n|n-1})\\ &=& \frac{\sigma_{n|n-1}^{2}}{\sigma_{n|n-1}^{2}+{\sigma_{n}^{2}}}s_{n}+\frac{{\sigma_{n}^{2}}}{\sigma_{n|n-1}^{2}+{\sigma_{n}^{2}}}\mu_{n|n-1}\\ \sigma_{n|n}^{2} &=& \frac{{\sigma_{n}^{2}}\sigma_{n|n-1}^{2}}{\sigma_{n|n-1}^{2}+{\sigma_{n}^{2}}}\\ \mu_{n + 1|n} &=& \mu_{n|n}\\ \sigma_{n + 1|n}^{2} &=& \sigma_{n|n}^{2}+(t_{n + 1}-t_{n})\sigma_{\xi}^{2} \end{array} $$

Hence, we can deduce the beliefs from the initial prior,

$$\begin{array}{@{}rcl@{}} \mu_{1|0} & =& 0\\ \sigma_{1|0}^{2} & =& W>0\ and\ arbitrarily\ large\\ \mu_{1|1} & =& s_{1}\\ \sigma_{1|1}^{2} & =& {\sigma_{1}^{2}}\\ \mu_{2|1} & =& s_{1}\\ \sigma_{2|1}^{2} & =& {\sigma_{1}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}\\ \mu_{2|2} &=& \frac{{\sigma_{1}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}}{{\sigma_{1}^{2}}+{\sigma_{2}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}}s_{2}+\frac{{\sigma_{2}^{2}}}{{\sigma_{1}^{2}}+{\sigma_{2}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}}s_{1}\\ \sigma_{2|2}^{2} &=& \frac{{\sigma_{2}^{2}}\left( {\sigma_{1}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}\right)}{{\sigma_{1}^{2}}+{\sigma_{2}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}}\\ \mu_{3|2} & =& \mu_{2|2}\\ \sigma_{3|2}^{2} &=& \frac{{\sigma_{2}^{2}}\left( {\sigma_{1}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}\right)}{{\sigma_{1}^{2}}+{\sigma_{2}^{2}}+(t_{2}-t_{1})\sigma_{\xi}^{2}}+(t_{3}-t_{2})\sigma_{\xi}^{2}\\ &...& \end{array} $$

$E(\mu _{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})=\mu _{n|n}$ is derived recursively following the above formulation.
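The recursion above is a scalar Kalman filter and can be written in a few lines. The sketch below is illustrative: the function name and the numerical inputs are hypothetical, and it starts directly from the diffuse-prior limit $\mu_{1|1}=s_{1}$, $\sigma_{1|1}^{2}=\sigma_{1}^{2}$ rather than carrying an arbitrarily large W.

```python
def posterior_beliefs(signals, times, noise_vars, sigma_xi2):
    """Return [(mu_{n|n}, sigma2_{n|n})] for n = 1..N.

    signals[n], times[n], noise_vars[n] describe the (n+1)-th review;
    sigma_xi2 is the per-period variance of the quality innovation.
    """
    mean, var = signals[0], noise_vars[0]  # limit of the update as W grows large
    beliefs = [(mean, var)]
    for n in range(1, len(signals)):
        # Prediction: the mean is unchanged, the variance grows with elapsed time.
        prior_var = var + (times[n] - times[n - 1]) * sigma_xi2
        # Update: weight on the new signal rises with prior uncertainty.
        gain = prior_var / (prior_var + noise_vars[n])
        mean = mean + gain * (signals[n] - mean)
        var = noise_vars[n] * prior_var / (prior_var + noise_vars[n])
        beliefs.append((mean, var))
    return beliefs


# Hypothetical inputs: two reviews, ten periods apart.
beliefs = posterior_beliefs([4.0, 3.0], [0, 10], [1.0, 0.5], 0.01)
print(beliefs[-1])
```

With these inputs the second posterior mean is approximately 3.3125, matching the closed form for $\mu_{2|2}$ above: the more recent and more precise signal receives weight 1.1/1.6.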

B.3 The correlation of ratings induced by quality change

We assume quality evolution follows a martingale process: $\mu_{rt}=\mu_{r(t-1)}+\xi_{t}$, where t denotes the units of calendar time since restaurant r was first reviewed and the t-specific evolution $\xi_{t}$ conforms to $\xi_{t}\sim i.i.d.\ N\left(0,\sigma_{\xi}^{2}\right)$. This martingale process introduces a positive correlation of restaurant quality over time,

$$\begin{array}{@{}rcl@{}} Cov(\mu_{rt},\mu_{rt^{\prime}}) &=&E\left( \mu_{r0}+\sum\limits_{\tau= 1}^{t}\xi_{\tau}-E(\mu_{rt})\right)\left( \mu_{r0}+\sum\limits_{\tau= 1}^{t^{\prime}}\xi_{\tau}-E(\mu_{rt^{\prime}})\right)\\ &=&E\left( \sum\limits_{\tau= 1}^{t}\xi_{\tau}\sum\limits_{\tau= 1}^{t^{\prime}}\xi_{\tau}\right)=\sum\limits_{\tau= 1}^{t}E\left( \xi_{\tau}^{2}\right)\ if\ t<t^{\prime}, \end{array} $$

which increases with the timing of the earlier date (t) but is independent of the time between t and $t^{\prime}$.
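Since the sum in the last line equals $t\sigma_{\xi}^{2}$, the covariance depends only on the earlier date. A rough Monte Carlo check of this property is sketched below; the helper names, the seed, the number of paths, and the parameter values are all arbitrary choices made for illustration.

```python
import random


def simulate_quality_path(mu0, horizon, sigma_xi, rng):
    """Random walk mu_t = mu_{t-1} + xi_t for t = 1..horizon, starting at mu0."""
    path = [mu0]
    for _ in range(horizon):
        path.append(path[-1] + rng.gauss(0.0, sigma_xi))
    return path


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / (n - 1)


rng = random.Random(12345)
sigma_xi = 1.0
paths = [simulate_quality_path(3.5, 30, sigma_xi, rng) for _ in range(20000)]

# Covariance between quality at t = 5 and at two later dates.
cov_5_10 = sample_covariance([p[5] for p in paths], [p[10] for p in paths])
cov_5_30 = sample_covariance([p[5] for p in paths], [p[30] for p in paths])
print(cov_5_10, cov_5_30)  # both should be close to 5 * sigma_xi**2
```

Both estimates should be near $5\sigma_{\xi}^{2}$ up to simulation error, regardless of how far the later date is from t = 5.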

Recall that $x_{rt_{n}}$ is the $n^{th}$ review, written at time $t_{n}$ since r was first reviewed. We can express the $n^{th}$ reviewer’s signal as:

$$\begin{array}{@{}rcl@{}} s_{rt_{n}} &=& \mu_{rt_{n}}+\epsilon_{rn}\\ \text{where}\ \ \ \mu_{rt_{n}} &=& \mu_{rt_{n-1}}+\xi_{t_{n-1}+ 1}+\xi_{t_{n-1}+ 2}+...+\xi_{t_{n}}. \end{array} $$

Signal noise $\epsilon_{rn}$ is assumed to be i.i.d. with $Var(s_{rt_{n}}|\mu_{rt_{n}})={\sigma_{i}^{2}}$ where i is the identity of the $n^{th}$ reviewer. The variance of restaurant quality at $t_{n}$ conditional on quality at $t_{n-1}$ is,

$$Var(\mu_{rt_{n}}|\mu_{rt_{n-1}}) = Var(\xi_{t_{n-1}+ 1}+\xi_{t_{n-1}+ 2}+...+\xi_{t_{n}})=(t_{n}-t_{n-1})\sigma_{\xi}^{2}={\Delta} t_{n}\sigma_{\xi}^{2}. $$

Note that the martingale assumption entails two features in the stochastic process: first, conditional on $\mu _{rt_{n-1}}$, $\mu _{rt_{n}}$ is independent of the past signals $\{s_{rt_{1}},...,s_{rt_{n-1}}\}$; second, conditional on $\mu _{rt_{n}}$, $s_{rt_{n}}$ is independent of the past signals $\{s_{rt_{1}},...,s_{rt_{n-1}}\}$. These two features greatly facilitate reviewer n’s Bayesian estimate of restaurant quality.

Appendix C: Deriving the likelihood function

C.1 Deriving the likelihood function $f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})$

Because the covariance structure of $\{x_{rt_{2}}-x_{rt_{1}},x_{rt_{3}}-x_{rt_{2}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}}\}$ is complicated, we use the change of variable technique to express the likelihood $f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})$ by $f(s_{rt_{2}}-s_{rt_{1}},...,s_{rt_{N_{r}}}-s_{rt_{N_{r}-1}})$,

$$f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})=|J_{\Delta s\rightarrow{\Delta} x}|^{-1}f(s_{rt_{2}}-s_{rt_{1}},...,s_{rt_{N_{r}}}-s_{rt_{N_{r}-1}}). $$

The derivation of $f(x_{rt_{2}}-x_{rt_{1}},...,x_{rt_{N_{r}}}-x_{rt_{N_{r}-1}})$ proceeds in three steps:

Step 1: To derive $f(s_{rt_{2}}-s_{rt_{1}},...,s_{rt_{N_{r}}}-s_{rt_{N_{r}-1}})$, we note that $s_{rt_{n}}=\mu _{rt_{n}}+\epsilon _{n}$ and thus, for any m > n, n ≥ 2, the variance and covariance structure can be written as:
$$\begin{array}{@{}rcl@{}} &&Cov(s_{rt_{n}}-s_{rt_{n-1}},s_{rt_{m}}-s_{rt_{m-1}})\\ &&= Cov(\epsilon_{rn}-\epsilon_{rn-1}+\xi_{t_{n-1}+ 1}+...+\xi_{t_{n}},\epsilon_{rm}-\epsilon_{rm-1}+\xi_{t_{m-1}+ 1}+...+\xi_{t_{m}})\\ &&= \left\{\begin{array}{ll} -\sigma_{rn}^{2} & if\ m=n + 1\\ 0 & if\ m>n + 1 \end{array}\right.\\ && Var(s_{rt_{n}}-s_{rt_{n-1}})\\ &&= \sigma_{rn}^{2}+\sigma_{rn-1}^{2}+(t_{n}-t_{n-1})\sigma_{\xi}^{2}. \end{array} $$
Denoting the total number of reviewers on restaurant r as $N_{r}$, the vector of the first differences of signals as ${\Delta }s_{r}=\{s_{rt_{n}}-s_{rt_{n-1}}\}_{n = 2}^{N_{r}}$, and its variance-covariance matrix as ${\Sigma }_{\Delta s_{r}}$, we have
$$f({\Delta} s_{r})=(2\pi)^{-\frac{N_{r}-1}{2}}|{\Sigma}_{\Delta s_{r}}|^{-\frac{1}{2}}\exp\left( -\frac{1}{2}{\Delta} s_{r}^{\prime}{\Sigma}_{\Delta s_{r}}^{-1}{\Delta} s_{r}\right). $$
Step 2: We derive the value of $\{s_{rt_{1}},...s_{rt_{N_{r}}}\}_{r = 1}^{R}$ from observed ratings $\{x_{rt_{1}},...x_{rt_{N_{r}}}\}_{r = 1}^{R}$. Given
$$x_{rt_{n}}=\lambda_{rn}+\rho_{n}E(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})+(1-\rho_{n})s_{rt_{n}} $$
and $E(\mu_{rt_{n}}|s_{rt_{1}},...s_{rt_{n}})$ as a function of $\{s_{rt_{1}},...s_{rt_{n}}\}$ (formula in Section B.2), we can solve $\{s_{rt_{1}},...s_{rt_{n}}\}$ from $\{x_{rt_{1}},...x_{rt_{n}}\}$ according to the recursive formula in Section C.2.
Step 3: We derive $|J_{\Delta s\rightarrow \Delta x}|^{-1}$ or $|J_{\Delta x\rightarrow \Delta s}|$, where $J_{\Delta x\rightarrow \Delta s}$ is such that
$$\left[\begin{array}{c} s_{rt_{2}}-s_{rt_{1}}\\ ...\\ s_{rt_{n}}-s_{rt_{n-1}} \end{array}\right]=J_{\Delta x\rightarrow{\Delta} s}\left[\begin{array}{c} x_{rt_{2}}-x_{rt_{1}}\\ ...\\ x_{rt_{n}}-x_{rt_{n-1}} \end{array}\right] $$
The analytical form of $J_{\Delta x\rightarrow \Delta s}$ is available given the recursive expression for $x_{rt_{n}}$ and $s_{rt_{n}}$.
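Step 1 can be illustrated with a short sketch that builds the tridiagonal covariance matrix of the signal differences and evaluates the standard multivariate normal log density, in which the determinant enters with exponent −1/2. The function names and inputs are hypothetical, the Jacobian term of Step 3 is not included, and a plain Cholesky factorization stands in for whatever numerical routine one would use in practice.

```python
import math


def delta_s_covariance(noise_vars, times, sigma_xi2):
    """Covariance matrix of (s_2 - s_1, ..., s_N - s_{N-1}).

    Diagonal: sigma_n^2 + sigma_{n-1}^2 + (t_n - t_{n-1}) * sigma_xi^2.
    First off-diagonal: -sigma_n^2 (the shared signal noise); zero elsewhere.
    """
    k = len(noise_vars) - 1
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        dt = times[i + 1] - times[i]
        cov[i][i] = noise_vars[i + 1] + noise_vars[i] + dt * sigma_xi2
        if i + 1 < k:
            cov[i][i + 1] = cov[i + 1][i] = -noise_vars[i + 1]
    return cov


def gaussian_logpdf(x, cov):
    """Log density of a zero-mean multivariate normal, via Cholesky."""
    k = len(x)
    chol = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            acc = cov[i][j] - sum(chol[i][m] * chol[j][m] for m in range(j))
            chol[i][j] = math.sqrt(acc) if i == j else acc / chol[j][j]
    # Solve chol * z = x by forward substitution; then x' cov^{-1} x = z'z.
    z = []
    for i in range(k):
        z.append((x[i] - sum(chol[i][m] * z[m] for m in range(i))) / chol[i][i])
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(k))
    return -0.5 * (k * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))


# Hypothetical inputs: three reviews at times 0, 10, 30.
cov = delta_s_covariance([1.0, 0.5, 2.0], [0, 10, 30], 0.01)
log_f = gaussian_logpdf([0.3, -0.2], cov)
print(cov, log_f)
```

The zero mean of the differences follows from the martingale assumption, since both the quality innovations and the signal noise have mean zero.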

C.2 Solving $\{s_{rt_{1}},...s_{rt_{n}}\}$ from observed ratings

Solve $\{s_{rt_{1}},...s_{rt_{n}}\}$ from $\{x_{rt_{1}},...x_{rt_{n}}\}$ according to the following recursive formula:

$$\begin{array}{@{}rcl@{}} x_{1} &=& s_{1}+\lambda_{1}\\ s_{1} &=& x_{1}-\lambda_{1}\\ \end{array} $$

$$\begin{array}{@{}rcl@{}} x_{2} &=&\rho_{2}\frac{{\sigma_{2}^{2}}}{\sigma_{2|1}^{2}+{\sigma_{2}^{2}}}\mu_{2|1}+\rho_{2}\frac{\sigma_{2|1}^{2}}{\sigma_{2|1}^{2}+{\sigma_{2}^{2}}}s_{2}+(1-\rho_{2})s_{2}+\lambda_{2}\\ &=&\rho_{2}\frac{{\sigma_{2}^{2}}}{\sigma_{2|1}^{2}+{\sigma_{2}^{2}}}\mu_{2|1}+\left[1-\left( 1-\frac{\sigma_{2|1}^{2}}{\sigma_{2|1}^{2}+{\sigma_{2}^{2}}}\right)\rho_{2}\right]s_{2}+\lambda_{2}\\ s_{2}&=&\frac{1}{\left[1-\left( 1-\frac{\sigma_{2|1}^{2}}{\sigma_{2|1}^{2}+{\sigma_{2}^{2}}}\right)\rho_{2}\right]}\left[x_{2}-\lambda_{2}-\rho_{2}\frac{{\sigma_{2}^{2}}}{\sigma_{2|1}^{2}+{\sigma_{2}^{2}}}\mu_{2|1}\right]\\ &...&\\ s_{n} &=&\frac{1}{\left[1-\left( 1-\frac{\sigma_{n|n-1}^{2}}{\sigma_{n|n-1}^{2}+{\sigma_{n}^{2}}}\right)\rho_{n}\right]}\left[x_{n}-\lambda_{n}-\rho_{n}\frac{{\sigma_{n}^{2}}}{\sigma_{n|n-1}^{2}+{\sigma_{n}^{2}}}\mu_{n|n-1}\right]. \end{array} $$
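The recursion can be sketched as a loop that alternates between inverting the rating equation for $s_{n}$ and updating the belief that the next reviewer will inherit. This is an illustrative sketch with hypothetical names and inputs; it takes the reviewer-specific $\lambda_{n}$, $\rho_{n}$, and $\sigma_{n}^{2}$ as known, whereas in the paper they are functions of estimated parameters.

```python
def recover_signals(ratings, times, noise_vars, rhos, lams, sigma_xi2):
    """Back out s_1..s_N from x_1..x_N using the recursive formula.

    All list arguments are indexed by review order; sigma_xi2 is the
    per-period variance of the quality innovation.
    """
    signals = []
    mean = var = None  # posterior belief after the previous review
    for n, x in enumerate(ratings):
        if n == 0:
            # With a diffuse prior the first rating is signal plus stringency.
            s = x - lams[0]
            mean, var = s, noise_vars[0]
        else:
            prior_var = var + (times[n] - times[n - 1]) * sigma_xi2
            gain = prior_var / (prior_var + noise_vars[n])
            # Invert x_n = lam_n + rho_n * ((1 - gain) * mean + gain * s_n) + (1 - rho_n) * s_n.
            s = (x - lams[n] - rhos[n] * (1.0 - gain) * mean) / (1.0 - (1.0 - gain) * rhos[n])
            mean = mean + gain * (s - mean)
            var = (1.0 - gain) * prior_var
        signals.append(s)
    return signals


# Hypothetical inputs for three reviews.
recovered = recover_signals(
    ratings=[4.1, 3.0, 3.6],
    times=[0, 10, 30],
    noise_vars=[1.0, 0.5, 2.0],
    rhos=[0.2, 0.3, -0.1],
    lams=[0.1, -0.2, 0.0],
    sigma_xi2=0.01,
)
print(recovered)
```

The inversion is well defined as long as the bracketed denominator is nonzero for every review, which holds whenever $\rho_{n} < 1$.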

Appendix D: Tables

Table 7 What explains the variance of Yelp ratings?

Table 8 Variability of ratings declines over time

Table 9 Examine serial correlation in restaurant ratings

Table 10 Does matching improve over time?

Table 11 Baseline model with different quality update frequency assumptions

Table 12 Estimation results: limited attention model vs. Bayesian rational model

Table 13 Characteristics of reviews with different gaps between simple and adjusted averages

Appendix E: Figures

Appendix F: “Restaurant reviews beliefs survey” questionnaire

To test our model against an external source of information, we conducted an online survey using Amazon MTurk (“Restaurant Reviews Beliefs Survey,” February 1, 2016) in which we asked how respondents used and comprehended restaurant ratings online (we did not mention Yelp in the survey). In total, 239 MTurk workers responded to our survey. The following shows the screenshot of the questionnaire.

About this article

Cite this article

Dai, W. (D.), Jin, G., Lee, J. et al. Aggregation of consumer ratings: an application to Yelp.com. Quant Mark Econ 16, 289–339 (2018). https://doi.org/10.1007/s11129-017-9194-9

Received: 01 April 2017
Accepted: 13 November 2017
Published: 29 December 2017
Issue Date: September 2018
DOI: https://doi.org/10.1007/s11129-017-9194-9
