Abstract
State-of-the-art train delay prediction systems do not exploit the historical train movement data collected by railway information systems; instead, they rely on static rules built by railway infrastructure experts using classical univariate statistics. The purpose of this paper is to build a data-driven train delay prediction system for large-scale railway networks that exploits recent Big Data technologies and learning algorithms. In particular, we propose a fast learning algorithm for predicting train delays, based on the Extreme Learning Machine, that fully exploits recent in-memory large-scale data processing technologies. Our system rapidly extracts nontrivial information from the large amount of available data in order to make accurate predictions about different future states of the railway network. Results on real-world data from the Italian railway network show that our proposal improves on current state-of-the-art train delay prediction systems.
L. Oneto—This research has been supported by the European Union through the projects Capacity4Rail (European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement 605650) and In2Rail (European Union’s Horizon 2020 research and innovation programme under grant agreement 635900).
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Oneto, L. et al. (2017). Delay Prediction System for Large-Scale Railway Networks Based on Big Data Analytics. In: Angelov, P., Manolopoulos, Y., Iliadis, L., Roy, A., Vellasco, M. (eds) Advances in Big Data. INNS 2016. Advances in Intelligent Systems and Computing, vol 529. Springer, Cham. https://doi.org/10.1007/978-3-319-47898-2_15
DOI: https://doi.org/10.1007/978-3-319-47898-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47897-5
Online ISBN: 978-3-319-47898-2
eBook Packages: Engineering, Engineering (R0)