Optimal Transportation Network Company Vehicle Dispatching via Deep Deterministic Policy Gradient

Shi, Dian; Li, Xuanheng; Li, Ming; Wang, Jie; Li, Pan; Pan, Miao

doi:10.1007/978-3-030-23597-0_24

Conference paper
First Online: 21 June 2019

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11604)

Abstract

With the popularity of smartphones and the maturity of civilian global positioning system (GPS) technology, transportation network company (TNC) services have become a prominent commute mode in many major cities, pairing passengers with TNC vehicles and drivers through mobile applications. However, as the number of TNC vehicles grows, dispatching them efficiently poses a crucial challenge. In this paper, we propose a novel deep reinforcement learning (DRL) method for dispatching TNC vehicles across different areas of a city, with joint consideration of the TNC company, the individual TNC vehicles, and the customers/passengers. The proposed model optimizes the geographic distribution of vehicles to meet customers’ demands while improving drivers’ profit. In particular, we consider the high-dimensional state and action spaces of the dynamic urban traffic environment and develop a deep deterministic policy gradient (DDPG) algorithm, an actor-critic based DRL algorithm, for dispatching vacant TNC vehicles. We use Didi Chuxing’s open data set to evaluate the proposed approach, and the simulation results show that it improves drivers’ average income while satisfying the supply and demand relationship between TNC vehicles and customers/passengers.
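For readers unfamiliar with deep deterministic policy gradient, the following is a minimal sketch of one actor-critic update step in the general style of DDPG (bootstrapped critic target, deterministic policy gradient for the actor, soft target updates). It is not the authors' implementation: the abstract does not specify the state, action, reward, or network design, so everything here is an assumption made for illustration. The scalar `s` could stand for something like a demand-supply gap in one region and `a` for a dispatch intensity, and the linear actor `mu(s) = theta * s` and linear critic `Q(s, a) = w_s * s + w_a * a` replace the deep networks a real system would use.

```python
from typing import List, Tuple

# Hypothetical toy parameterisation: (theta, w_s, w_a)
#   actor:  mu(s)   = theta * s
#   critic: Q(s, a) = w_s * s + w_a * a
Params = Tuple[float, float, float]
# (state, action, reward, next_state, done)
Transition = Tuple[float, float, float, float, bool]


def td_target(reward: float, next_q: float, gamma: float, done: bool) -> float:
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')) unless terminal."""
    return reward if done else reward + gamma * next_q


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def ddpg_step(
    online: Params,
    target: Params,
    transition: Transition,
    gamma: float = 0.9,
    tau: float = 0.5,
    lr_critic: float = 0.1,
    lr_actor: float = 0.1,
) -> Tuple[Params, Params]:
    """One update on a single transition; returns (new_online, new_target)."""
    theta, w_s, w_a = online
    theta_t, w_s_t, w_a_t = target
    s, a, r, s_next, done = transition

    # 1. Bootstrapped target computed with the *target* actor and critic.
    next_a = theta_t * s_next
    next_q = w_s_t * s_next + w_a_t * next_a
    y = td_target(r, next_q, gamma, done)

    # 2. Critic: one gradient-descent step on 0.5 * (Q(s, a) - y)^2.
    err = (w_s * s + w_a * a) - y
    w_s = w_s - lr_critic * err * s
    w_a = w_a - lr_critic * err * a

    # 3. Actor: gradient ascent along dQ/da * dmu/dtheta = w_a * s.
    theta = theta + lr_actor * w_a * s

    # 4. Target parameters slowly track the online parameters.
    new_online = (theta, w_s, w_a)
    new_target = tuple(soft_update(list(target), list(new_online), tau))
    return new_online, new_target
```

In practice the transition would be sampled from a replay buffer in mini-batches, exploration noise would be added to the actor's output, and much smaller values of `tau` and the learning rates would be typical; the large defaults here only keep the arithmetic easy to follow.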

Acknowledgement

The work of D. Shi and M. Pan was supported in part by the U.S. National Science Foundation under grants US CNS-1350230 (CAREER), CNS-1646607, CNS-1702850, and CNS-1801925. The work of X. Li was supported in part by the National Natural Science Foundation of China under grant 61801080 and the Fundamental Research Funds of Dalian University of Technology under grant DUT18RC(3)012. The work of M. Li was supported by the U.S. National Science Foundation under grants CNS-1566634 and ECCS-1711991. The work of J. Wang was supported in part by the National Natural Science Foundation of China under grant 61671102, Liaoning Province Natural Science Foundation under grant 20180520026, Dalian Science and Technology Innovation Foundation under grant 2018J12GX044, and Dalian High-level Talent Innovation Support Program Project under grant 2017RQ096.

Author information

Authors and Affiliations

University of Houston, Houston, TX, 77204, USA
Dian Shi & Miao Pan
Dalian University of Technology, Dalian, China
Xuanheng Li & Jie Wang
University of Texas at Arlington, Arlington, TX, 76019, USA
Ming Li
Case Western Reserve University, Cleveland, OH, 44106, USA
Pan Li

Corresponding author

Correspondence to Miao Pan.

Editor information

Editors and Affiliations

University of Hawaii at Manoa, Honolulu, HI, USA
Edoardo S. Biagioni
University of Hawaii at Manoa, Honolulu, USA
Yao Zheng
Harbin Institute of Technology, Harbin, China
Siyao Cheng

About this paper

Cite this paper

Shi, D., Li, X., Li, M., Wang, J., Li, P., Pan, M. (2019). Optimal Transportation Network Company Vehicle Dispatching via Deep Deterministic Policy Gradient. In: Biagioni, E., Zheng, Y., Cheng, S. (eds) Wireless Algorithms, Systems, and Applications. WASA 2019. Lecture Notes in Computer Science, vol 11604. Springer, Cham. https://doi.org/10.1007/978-3-030-23597-0_24

DOI: https://doi.org/10.1007/978-3-030-23597-0_24
Published: 21 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23596-3
Online ISBN: 978-3-030-23597-0
eBook Packages: Computer Science, Computer Science (R0)
