Abstract
Crowd-sourced content in the form of online product reviews or recommendations is an integral feature of most Internet-based service platforms and marketplaces, including Yelp, TripAdvisor, Netflix, and Amazon. Customers may find such information useful when deciding between potential alternatives; at the same time, the process of generating such content is mainly driven by the customers’ decisions themselves. In other words, the service platform or marketplace “explores” the set of available options through its customers’ decisions, while the customers “exploit” the information they obtain from the platform about past experiences to determine whether and what to purchase. Unlike the extensive work on the trade-off between exploration and exploitation in the context of multi-armed bandits, the canonical framework we discuss in this chapter involves a principal that explores a set of options through the actions of self-interested agents. In this framework, the incentives of the principal and the agents towards exploration are misaligned, but the former can potentially incentivize the actions of the latter by appropriately designing a payment scheme or an information provision policy.
Notes
- 1.
Kleinberg and Slivkins (2017) also recently presented a comprehensive tutorial related to these issues.
- 2.
- 3.
Our analysis can be readily extended to the case of more than two providers.
- 4.
The probability density function of a Beta(s, f) random variable is given by
$$g(x;s,f)=\frac{x^{s-1}(1-x)^{f-1}}{B(s,f)},\quad \text{for } x\in[0,1].$$
- 5.
The platform and the customers hold the same prior belief, so that platform actions (e.g., the choice of an information-provision policy) do not convey any additional information on provider quality to the customers (e.g., Bergemann and Välimäki 1997; Bose et al. 2006; Papanastasiou and Savva 2017).
- 6.
Commitment is a reasonable assumption in the context of online platforms, where information provision occurs on the basis of pre-decided algorithms and the large volume of products/services hosted renders ad-hoc adjustments of the automatically-generated content prohibitively costly.
- 7.
The generic term “message” refers to a specific configuration of information that is observed by the customer; examples of messages include detailed outcome histories (i.e., distributions of customer reviews), relative rankings of providers, or recommendations for a specific product.
- 8.
More generally, our analysis is relevant for cases where the platform has a different (e.g., longer-run) objective than its users.
- 9.
Note that for the case of a Bernoulli reward process the current probability of success (i.e., the Bayesian probability of the next trial being a success given the current state of the system) is equal to the immediate expected reward, \(r(x_t,i)\) (e.g., Gittins et al. 2011).
- 10.
This expectation can be computed by the period-t customer, since the ex ante probability that the state in period t is \(x_t\) (i.e., unconditional on the message \(g(x_t)\)) is known to the customer through her knowledge of the designer’s policy in previous periods and the preceding customers’ best response to this policy.
- 11.
The result of Proposition 1 extends readily to the case of |S| = n providers (in this case, an ICRP consists of n possible recommendations, and each recommendation must satisfy n − 1 IC constraints per period), as well as to alternative platform objective functions (by replacing r(k, i) with suitable reward functions).
- 12.
Note that the solution to LP (10.4) can also be used to retrieve the period-t customer’s belief over the system state upon entry to the platform; specifically, this belief is given by \(P(x_t=z)={\sum _{i\in S}\rho (z,i)}/ ({\sum _{k\in X_t}\sum _{i\in S}\rho (k,i)})\).
- 13.
This is a natural generalization of the computation in the example of Sect. 10.3.
- 14.
Che and Hörner (2017) also consider the problem of optimally designing recommendation policies in a setting where information about the quality of two potential alternatives arrives continuously over time; their setting uses the exponential bandit framework of Keller et al. (2005) as a building block.
- 15.
However, the analysis may, in general, be quite challenging.
- 16.
The Netflix Prize, which concluded in 2009, offered a million dollars to anyone who succeeded in improving the company’s recommendation algorithm by a certain margin. The Heritage Prize was a multi-year contest whose goal was to provide an algorithm that predicts patient readmissions to hospitals. A successful breakthrough was obtained in 2013.
- 17.
In addition to the work that we discuss here, which mainly focuses on the dynamics of learning and competition in contests, there is also an extensive body of work that explores a number of questions in a static framework, e.g., Terwiesch and Xu (2008), Ales et al. (2017), and Körpeoğlu and Cho (2017).
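The formulas in Notes 4, 9, and 12 can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the chapter's implementation: it assumes that the belief about a provider is Beta(s, f) with s, f ≥ 1, that a Bernoulli outcome updates (s, f) by adding one to the matching count, and that the LP solution ρ(k, i) is available as a dictionary keyed by (state, provider). All function names (`beta_pdf`, `expected_reward`, `update`, `state_belief`) are hypothetical.

```python
import math
from typing import Dict, Hashable, Iterable, Tuple


def beta_pdf(x: float, s: float, f: float) -> float:
    """Density g(x; s, f) of a Beta(s, f) random variable (Note 4)."""
    if s < 1 or f < 1:
        # Assumption of this sketch: s, f >= 1, so the density is finite at 0 and 1.
        raise ValueError("this sketch assumes s >= 1 and f >= 1")
    if not 0.0 <= x <= 1.0:
        return 0.0
    # B(s, f) = Gamma(s) * Gamma(f) / Gamma(s + f), computed in logs for stability.
    log_b = math.lgamma(s) + math.lgamma(f) - math.lgamma(s + f)
    return x ** (s - 1) * (1 - x) ** (f - 1) / math.exp(log_b)


def expected_reward(s: float, f: float) -> float:
    """Probability that the next Bernoulli trial succeeds under a Beta(s, f)
    belief, i.e. the posterior mean s / (s + f) (Note 9)."""
    return s / (s + f)


def update(s: float, f: float, success: bool) -> Tuple[float, float]:
    """Bayesian update of the Beta parameters after one Bernoulli outcome."""
    return (s + 1, f) if success else (s, f + 1)


def state_belief(
    rho: Dict[Tuple[Hashable, Hashable], float],
    states_t: Iterable[Hashable],
    providers: Iterable[Hashable],
) -> Dict[Hashable, float]:
    """Belief P(x_t = z) over the period-t states X_t, obtained by normalizing
    the LP solution rho(k, i) as in Note 12. Missing keys count as zero."""
    providers = list(providers)
    mass = {z: sum(rho.get((z, i), 0.0) for i in providers) for z in states_t}
    total = sum(mass.values())
    if total <= 0:
        raise ValueError("rho places no mass on the given states")
    return {z: m / total for z, m in mass.items()}
```

For example, `expected_reward(*update(1, 1, success=True))` gives the success probability after one observed success starting from a uniform Beta(1, 1) prior.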
References
Acemoglu D, Dahleh MA, Lobel I, Ozdaglar A (2011) Bayesian learning in social networks. Rev Econ Stud 78(4):1201–1236
Acemoglu D, Bimpikis K, Ozdaglar A (2014) Dynamics of information exchange in endogenous social networks. Theor Econ 9(1):41–97
Ales L, Cho SH, Körpeoğlu E (2017, Forthcoming) Optimal award scheme in innovation tournaments. Oper Res 65(3):693–702
Allon G, Zhang DJ (2017) Managing service systems in the presence of social networks. Working paper
Allon G, Bassamboo A, Gurvich I (2011) “We will be right with you”: managing customer expectations with vague promises and cheap talk. Oper Res 59(6):1382–1394
Altman E (1999) Constrained Markov decision processes. CRC Press, Boca Raton
Balseiro SR, Feldman J, Mirrokni V, Muthukrishnan S (2014) Yield optimization of display advertising with ad exchange. Manag Sci 60(12):2886–2907
Balseiro SR, Besbes O, Weintraub GY (2015) Repeated auctions with budgets in ad exchanges: approximations and design. Manag Sci 61(4):864–884
Banerjee A (1992) A simple model of herd behavior. Q J Econ 107(3):797–817
Bergemann D, Välimäki J (1997) Market diffusion with two-sided learning. RAND J Econ 28(4):773–795
Bertsimas D, Mersereau A (2007) A learning approach for interactive marketing to a customer segment. Oper Res 55(6):1120–1135
Besbes O, Scarsini M (2017, Forthcoming) On information distortions in online ratings. Oper Res 66(3):597–610
Bikhchandani S, Hirshleifer D, Welch I (1992) A theory of fads, fashion, custom, and cultural change as informational cascades. J Polit Econ 100(5):992–1026
Bimpikis K, Drakopoulos K (2016) Disclosing information in strategic experimentation. Working paper
Bimpikis K, Candogan O, Saban D (2017a) Spatial pricing in ride-sharing networks. Working paper
Bimpikis K, Elmaghraby WJ, Moon K, Zhang W (2017b) Managing market thickness in online B2B markets. Working paper
Bimpikis K, Ehsani S, Mostagir M (2018, Forthcoming) Designing dynamic contests. Oper Res
Bolton P, Harris C (1999) Strategic experimentation. Econometrica 67(2):349–374
Bose S, Orosel G, Ottaviani M, Vesterlund L (2006) Dynamic monopoly pricing and herding. RAND J Econ 37(4):910–928
Cachon GP, Daniels KM, Lobel R (2017) The role of surge pricing on a service platform with self-scheduling capacity. Manuf Serv Oper Manag 19(3):368–384
Candogan O, Drakopoulos K (2017) Optimal signaling of content accuracy: engagement vs. misinformation. Working paper
Caro F, Gallien J (2007) Dynamic assortment with demand learning for seasonal consumer goods. Manag Sci 53(2):276–292
Che YK, Hörner J (2017) Recommender systems as incentives for social learning. Working paper
Crapis D, Ifrach B, Maglaras C, Scarsini M (2017) Monopoly pricing in the presence of social learning. Manag Sci 63(11):3586–3608
Crawford V, Sobel J (1982) Strategic information transmission. Econometrica 50(6):1431–1451
Debo L, Parlour C, Rajan U (2012) Signaling quality via queues. Manag Sci 58(5):876–891
Ely JC (2017) Beeps. Am Econ Rev 107(1):31–53
Feldman P, Papanastasiou Y, Segev E (2018, Forthcoming) Social learning and the design of new experience goods. Manag Sci
Frazier P, Kempe D, Kleinberg J, Kleinberg R (2014) Incentivizing exploration. In: Proceedings of the 15th ACM conference on economics and computation, Palo Alto. ACM, pp 5–22
Girotra K, Terwiesch C, Ulrich KT (2010) Idea generation and the quality of the best idea. Manag Sci 56(4):591–605
Gittins J, Jones D (1974) A dynamic allocation index for the sequential design of experiments. Progress in statistics, pp 241–266. Read at the 1972 European Meeting of Statisticians, Budapest
Gittins J, Glazebrook K, Weber R (2011) Multi-armed bandit allocation indices. Wiley, Chichester
Halac M, Kartik N, Liu Q (2017) Contests for experimentation. J Polit Econ 125(5):1523–1569
Hörner J, Skrzypacz A (2016) Learning, experimentation and information design. Survey Prepared for the 2015 econometric summer meetings in Montreal
Hu M, Wang L (2017) Joint vs. separate crowdsourcing contests. Working paper
Hu M, Shi M, Wu J (2013) Simultaneous vs. sequential group-buying mechanisms. Manag Sci 59(12):2805–2822
Huang Y, Vir Singh P, Srinivasan K (2014) Crowdsourcing new product ideas under consumer learning. Manag Sci 60(9):2138–2159
Jiang ZZ, Huang Y, Beil DR (2016) The role of feedback in dynamic crowdsourcing contests: a structural empirical analysis. Working paper
Kamenica E, Gentzkow M (2011) Bayesian persuasion. Am Econ Rev 101(6):2590–2615
Kanoria Y, Saban D (2017) Facilitating the search for partners on matching platforms: restricting agent actions. Working paper
Keller G, Rady S, Cripps M (2005) Strategic experimentation with exponential bandits. Econometrica 73(1):39–68
Kleinberg RD, Slivkins A (2017) Tutorial: incentivizing and coordinating exploration. In: Proceedings of the 18th ACM conference on economics and computation, Cambridge
Kornish LJ, Ulrich KT (2011) Opportunity spaces in innovation: empirical analysis of large samples of ideas. Manag Sci 57(1):107–128
Körpeoğlu E, Cho SH (2017, Forthcoming) Incentives in contests with heterogeneous solvers. Manag Sci 64:2709–2715
Kremer I, Mansour Y, Perry M (2014) Implementing the “wisdom of the crowd.” J Polit Econ 122(5):988–1012
Li J, Netessine S (2017) Market thickness and matching (in) efficiency: evidence from a quasi-experiment. Working paper
Lobel I, Sadler E (2015) Preferences, homophily, and social learning. Oper Res 64(3):564–584
Mansour Y, Slivkins A, Syrgkanis V, Wu ZSW (2015) Bayesian exploration: incentivizing exploration in Bayesian games. In: Proceedings of the 16th ACM conference on economics and computation, Portland. ACM, pp 565–582
Marinesi S, Girotra K, Netessine S (2017, Forthcoming) The operational advantages of threshold discounting offers. Manag Sci 64:2690–2708
Moon K, Bimpikis K, Mendelson H (2017, Forthcoming) Randomized markdowns and online monitoring. Manag Sci 64:1271–1290
Orlov D, Skrzypacz A, Zryumov P (2017) Persuading the principal to wait. Working paper
Papanastasiou Y (2017) Fake news propagation and detection: a sequential model. Working paper
Papanastasiou Y, Savva N (2017) Dynamic pricing in the presence of social learning and strategic consumers. Manag Sci 63(4):919–939
Papanastasiou Y, Bimpikis K, Savva N (2017, Forthcoming) Crowdsourcing exploration. Manag Sci 64:1727–1746
Rayo L, Segal I (2010) Optimal information disclosure. J Polit Econ 118(5):949–987
Renault J, Solan E, Vieille N (2017) Optimal dynamic information provision. Games Econ Behav 104:329–349
Seel C, Strack P (2016) Continuous time contests with private information. Math Oper Res 41(3):1093–1107
Strack P (2016) Risk-taking in contests: the impact of fund-manager compensation on investor welfare. Working paper
Swinney R (2011) Selling to strategic consumers when product value is uncertain: the value of matching supply and demand. Manag Sci 57(10):1737–1751
Taylor T (2018) On-demand service platforms. Manuf Serv Oper Manag 20(4):704–720
Terwiesch C, Xu Y (2008) Innovation contests, open innovation, and multiagent problem solving. Manag Sci 54(9):1529–1543
Veeraraghavan S, Debo L (2009) Joining longer queues: information externalities in queue choice. Manuf Serv Oper Manag 11(4):543–562
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bimpikis, K., Papanastasiou, Y. (2019). Inducing Exploration in Service Platforms. In: Hu, M. (eds) Sharing Economy. Springer Series in Supply Chain Management, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-030-01863-4_10
DOI: https://doi.org/10.1007/978-3-030-01863-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01862-7
Online ISBN: 978-3-030-01863-4
eBook Packages: Business and Management; Business and Management (R0)