Distributed SECONDO: A Highly Available and Scalable System for Spatial Data Processing
Cassandra is a highly available and scalable data store, but it provides only limited capabilities for data analysis. Database management systems (DBMS), in contrast, provide many functions for analyzing data, but most of them scale poorly. In this paper, a novel method is proposed to couple Cassandra with a DBMS. The result is a highly available and scalable system that provides all the functions of the DBMS in a distributed manner. Cassandra is used as the data store, and the DBMS Secondo is used as the query processing engine. Secondo is an extensible DBMS that provides various data models, e.g. models for spatial data and moving objects data. With Distributed Secondo, functions such as spatial joins can be executed in a distributed and parallelized manner on many computers.
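To make the idea of a distributed spatial join more concrete, the following is a minimal single-process sketch of a grid-partitioned join on bounding boxes. It is not taken from Distributed Secondo: the grid partitioning scheme, the function names (`partition`, `join_cell`, `grid_spatial_join`) and the `cell_size` parameter are assumptions made for illustration only. The point it shows is that, once both inputs are assigned to grid cells, each cell can be joined independently, so in a distributed setting the per-cell work could be handed to different nodes.

```python
from collections import defaultdict
from itertools import product

# A rectangle is a tuple (xmin, ymin, xmax, ymax).
# All names below are hypothetical and only illustrate the technique.

def intersects(a, b):
    """Return True if the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cells(rect, cell_size):
    """Yield every grid cell (column, row) that the rectangle covers."""
    x0, x1 = int(rect[0] // cell_size), int(rect[2] // cell_size)
    y0, y1 = int(rect[1] // cell_size), int(rect[3] // cell_size)
    return product(range(x0, x1 + 1), range(y0, y1 + 1))

def partition(rects, cell_size):
    """Map each grid cell to the ids of the rectangles covering it."""
    parts = defaultdict(list)
    for rid, rect in rects.items():
        for cell in cells(rect, cell_size):
            parts[cell].append(rid)
    return parts

def join_cell(ids_r, ids_s, r, s):
    """Join the two inputs within one cell; this is the unit of parallel work."""
    return {(i, j) for i in ids_r for j in ids_s if intersects(r[i], s[j])}

def grid_spatial_join(r, s, cell_size):
    """Join r and s cell by cell; the result set removes duplicate pairs
    that arise when two rectangles share more than one cell."""
    parts_r = partition(r, cell_size)
    parts_s = partition(s, cell_size)
    result = set()
    for cell in parts_r.keys() & parts_s.keys():
        result |= join_cell(parts_r[cell], parts_s[cell], r, s)
    return result

roads = {"r1": (0, 0, 5, 5), "r2": (20, 20, 25, 25)}
parks = {"p1": (4, 4, 12, 12), "p2": (30, 30, 31, 31)}
pairs = grid_spatial_join(roads, parks, cell_size=10)
```

In this sketch the loop over cells runs sequentially; because the cells do not depend on each other, the same loop body is what a set of worker nodes could execute in parallel on their own share of the cells.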