Abstract
The explosive growth of scientific data has given rise to the “Fourth Paradigm” of research described by Jim Gray. Processing data at this scale quickly requires parallel distributed processing. Because data-intensive computing demands high I/O throughput, data transfer between nodes should be kept to a minimum. Current technologies focus mainly on frameworks for reliable local networks with homogeneous resources, whereas parallel processing frameworks for data-intensive scientific problems, such as spatial data shared over the Internet and queried by semantics, have not been fully studied. In this article, we propose a new data-intensive parallel processing framework for spatial data, Robinia DSSSD (Distributed Storage and Service for Spatial Data), which flexibly supports data distribution and allocation across the Internet as well as semantic queries. Experiments show that Robinia DSSSD achieves good acceleration with low overhead and supports data-intensive computing well.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhao, D., Gu, Y., Huang, Z. (2014). A New Data-Intensive Parallel Processing Framework for Spatial Data. In: Wong, W.E., Zhu, T. (eds) Computer Engineering and Networking. Lecture Notes in Electrical Engineering, vol 277. Springer, Cham. https://doi.org/10.1007/978-3-319-01766-2_43
DOI: https://doi.org/10.1007/978-3-319-01766-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01765-5
Online ISBN: 978-3-319-01766-2
eBook Packages: Engineering, Engineering (R0)