Framework for Geospatial Query Processing by Integrating Cassandra with Hadoop

Abstract

Digitization is connecting devices such as sensors and cameras to the Internet, and these devices produce big data. The variety of this data has paved the way for NoSQL databases such as Cassandra, which offer scalability and availability. The Hadoop framework was developed for storing and processing distributed data. In this work, we investigate the storage and retrieval of geospatial data by integrating Hadoop and Cassandra, using two partitioning techniques: prefix-based partitioning and Cassandra’s default partitioning algorithm, Murmur3Partitioner. A geohash value is generated and acts as the partition key; it also supports efficient search, which reduces the time taken to retrieve data. When a user issues a spatial query, such as finding the nearest locations, the search in the Cassandra database is run under both partitioning techniques. The query response times are then compared to determine which method is more effective. The results show that prefix-based partitioning is more efficient than Murmur3 partitioning.
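To illustrate how a geohash can serve as a partition key, the following is a minimal sketch of the standard geohash encoding together with a simple prefix-based grouping. It is not the chapter's implementation: the function names `geohash_encode` and `partition_by_prefix`, the prefix length of 3, and the in-memory dictionary standing in for Cassandra partitions are all assumptions made for illustration.

```python
from collections import defaultdict

# Standard geohash base-32 alphabet (omits a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string.

    Bits are produced by repeatedly halving the longitude and latitude
    ranges, alternating between the two and starting with longitude.
    Every 5 bits become one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def partition_by_prefix(points, prefix_len=3, precision=8):
    """Group (name, lat, lon) tuples by the first prefix_len geohash characters.

    Hypothetical stand-in for prefix-based partitioning: the dictionary key
    plays the role of the partition key, so nearby points share a bucket.
    """
    buckets = defaultdict(list)
    for name, lat, lon in points:
        gh = geohash_encode(lat, lon, precision)
        buckets[gh[:prefix_len]].append((gh, name))
    return buckets
```

Because a shorter geohash is always a prefix of a longer one for the same point, locations that are close together tend to fall into the same bucket, so a nearest-location query can start with a single partition. Points that lie close together but on opposite sides of a cell boundary can still receive different prefixes, so a practical search would usually also examine neighbouring cells.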


Author information

Corresponding author

Correspondence to S. Vasavi.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Vasavi, S., Padma Priya, M., Gokhale, A.A. (2018). Framework for Geospatial Query Processing by Integrating Cassandra with Hadoop. In: Margret Anouncia, S., Wiil, U. (eds) Knowledge Computing and Its Applications. Springer, Singapore. https://doi.org/10.1007/978-981-10-6680-1_7

  • DOI: https://doi.org/10.1007/978-981-10-6680-1_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6679-5

  • Online ISBN: 978-981-10-6680-1

  • eBook Packages: Computer Science, Computer Science (R0)
