Multi-node Approach for Map Data Processing
OpenStreetMap (OSM) is a popular collaborative open-source project that offers a free, editable map of the whole world. However, this data often needs further purpose-specific processing before it becomes valuable information to work with. The main motivation of this paper is therefore to propose a design for big data processing and data mining that yields statistics focused on the details of traffic data, with the goal of creating graphs that represent a road network. To ensure that the routing algorithms on our High-Performance Computing (HPC) platform work correctly, it is essential to prepare the OSM data so that it is usable for the above-mentioned graph, and to store this persistent data in both a spatial database and the HDF5 format.
Keywords: OpenStreetMap, Road network quality, Big data parsing, Multi-node processing, ETL, State machine, Pipeline
This work has been partially funded by ANTAREX, a project supported by the EU H2020 FET-HPC program under grant 671623, and by the Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II) project ‘IT4Innovations excellence in science - LQ1602’.