Abstract
Traffic increasingly shapes the trajectory of city growth and contributes to climate change in modern cities. Monitoring traffic patterns can offer innovative ways of understanding city traffic dynamics, especially through sensory and textual data analytics. Recent state-of-the-art research has focused on processing voluminous real-time data by capturing sensory observations and/or social network (textual) data about city traffic. In this paper, we investigate the feasibility of using Big Data produced by Twitter textual streams to extract traffic-related events. After describing a generic yet innovative application used for data capture, we preprocess the data so that they fit the input structure of the machine learning models used for clustering (unsupervised learning) and classification (supervised learning). For clustering, we use Apache Spark on a MapR sandbox with the KMeans algorithm. For classification, we compare several machine learning methodologies, including Multi-Layer Perceptron Neural Networks (MLP-NN), Support Vector Machines (SVM), and a Deep Convolutional Learning (DCL) approach, to contextualize citizen observations and responses expressed in tweets. Precision, accuracy, recall, and F-score are used as statistical metrics to assess the performance of each model. Our experiments include clustering, a 2-class classification, and a 3-class classification, in which MLP-NN achieved an accuracy of 89.6% and SVM 92.73%, while DCL performed worst at 81.76%.
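The four evaluation criteria named in the abstract can be illustrated with a minimal sketch. The labels below ("traffic", "other") and the toy predictions are hypothetical and are not taken from the chapter's data; the function names are likewise illustrative, and the chapter may compute or average these metrics differently (for example, per class or macro-averaged in the 3-class case).

```python
def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, F-score) with one label treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return precision, recall, f_score


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical 2-class example: is a tweet traffic-related or not?
y_true = ["traffic", "traffic", "other", "other", "traffic", "other"]
y_pred = ["traffic", "other", "other", "traffic", "traffic", "other"]

precision, recall, f_score = per_class_metrics(y_true, y_pred, positive="traffic")
print(f"accuracy={accuracy(y_true, y_pred):.3f}")
print(f"precision={precision:.3f} recall={recall:.3f} F-score={f_score:.3f}")
```

For a multi-class setting, the same function can be called once per label and the results averaged; which averaging scheme the authors used is not stated in the abstract.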
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kokkinos, K., Nathanail, E., Papageorgiou, E. (2019). Applying Unsupervised and Supervised Machine Learning Methodologies in Social Media Textual Traffic Data. In: Nathanail, E., Karakikes, I. (eds) Data Analytics: Paving the Way to Sustainable Urban Mobility. CSUM 2018. Advances in Intelligent Systems and Computing, vol 879. Springer, Cham. https://doi.org/10.1007/978-3-030-02305-8_80
DOI: https://doi.org/10.1007/978-3-030-02305-8_80
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02304-1
Online ISBN: 978-3-030-02305-8
eBook Packages: Engineering, Engineering (R0)