Abstract
Perceived unfairness in pricing has been shown to be one of the most negative perceptions customers can have about a company's prices, and it may result in long-term losses for that company. Although dynamic pricing models help companies maximize revenue, fairness and equality should be taken into account to avoid unfair price differences between groups of customers. This paper shows how to solve dynamic pricing with Reinforcement Learning (RL) techniques so that prices are set to maximize revenue while balancing it against fairness. We demonstrate that RL provides two main features that support fairness in dynamic pricing: first, RL learns from recent experience, adapting the pricing policy to complex market environments; second, it provides a trade-off between short- and long-term objectives, which allows fairness to be integrated into the model’s core. Building on these two features, we propose applying RL to revenue optimization and integrating fairness into the learning procedure, using Jain’s index as the fairness metric. Results in a simulated environment show a significant improvement in fairness while revenue optimization is maintained.
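As an illustration of the fairness metric named in the abstract, the following minimal sketch computes Jain's index, J(x) = (Σ x_i)² / (n · Σ x_i²), which ranges from 1/n (one group receives everything) to 1 (all groups are treated equally). The function `fair_reward`, its `weight` parameter, and the assumption that revenue is already normalized to [0, 1] are hypothetical and only show one possible way to blend revenue and fairness in a reward signal; the abstract does not specify the reward actually used by the authors.

```python
from typing import Sequence


def jains_index(values: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in [1/n, 1]."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    if sum_sq == 0:
        # Assumption: an all-zero allocation is treated as perfectly equal.
        return 1.0
    return (total * total) / (n * sum_sq)


def fair_reward(revenue: float, group_prices: Sequence[float],
                weight: float = 0.5) -> float:
    """Hypothetical reward: convex blend of normalized revenue and fairness.

    `revenue` is assumed to be scaled to [0, 1] so that it is comparable
    with Jain's index; `weight` sets the importance given to fairness.
    """
    return (1.0 - weight) * revenue + weight * jains_index(group_prices)


if __name__ == "__main__":
    print(jains_index([10.0, 10.0, 10.0]))  # identical prices for all groups
    print(jains_index([30.0, 0.0, 0.0]))    # one group bears the whole price
    print(fair_reward(0.8, [10.0, 12.0, 9.0], weight=0.3))
```

A higher `weight` pushes a learning agent toward equal prices across customer groups at the expense of revenue; a value of 0 recovers pure revenue maximization.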
Notes
1. \(\epsilon = 1.0\). In the null model, all actions are selected at random.
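The note's setting can be illustrated with a standard \(\epsilon\)-greedy selection rule. This is a minimal sketch under the assumption that the agent chooses among a discrete set of price actions; the function name `epsilon_greedy` and the list `q_values` are hypothetical and not taken from the paper. With \(\epsilon = 1.0\) the exploration branch is always taken, so every action is drawn uniformly at random, which matches the null model described in the note.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: Optional[random.Random] = None) -> int:
    """Return an action index using epsilon-greedy selection.

    With probability `epsilon` a uniformly random action is chosen;
    otherwise the action with the highest estimated value is chosen.
    """
    if rng is None:
        rng = random.Random()
    # rng.random() lies in [0.0, 1.0), so epsilon = 1.0 always explores
    # and epsilon = 0.0 never does.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    q = [0.1, 0.7, 0.2]  # hypothetical value estimates for three prices
    rng = random.Random(0)
    print([epsilon_greedy(q, 1.0, rng) for _ in range(10)])  # null model
    print([epsilon_greedy(q, 0.0, rng) for _ in range(10)])  # purely greedy
```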
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Maestre, R., Duque, J., Rubio, A., Arevalo, J. (2019). Reinforcement Learning for Fair Dynamic Pricing. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 868. Springer, Cham. https://doi.org/10.1007/978-3-030-01054-6_8
DOI: https://doi.org/10.1007/978-3-030-01054-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01053-9
Online ISBN: 978-3-030-01054-6
eBook Packages: Intelligent Technologies and Robotics (R0)