Abstract
With the emergence of cloud computing and other web technologies, data storage has become easy and cheap. As a result, petabytes to zettabytes of data are placed in cloud storage every day through portals such as YouTube and Facebook. This creates a need for big data analysis and data-intensive computing. In this paper, we present a comprehensive survey of data-intensive computing techniques and MapReduce paradigm mechanisms in cloud computing. We first present data-intensive computing methodologies and then describe the MapReduce algorithm for data-intensive computing in big data analysis.
Acknowledgment
The authors would like to acknowledge Dr. Ganesh Neelakanta Iyer for his valuable comments in completing this research work.
Copyright information
© 2015 Springer India
About this paper
Cite this paper
Iyer, G.N., Silas, S. (2015). A Comprehensive Survey on Data-Intensive Computing and MapReduce Paradigm in Cloud Computing Environments. In: Rajsingh, E., Bhojan, A., Peter, J. (eds) Informatics and Communication Technologies for Societal Development. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1916-3_9
DOI: https://doi.org/10.1007/978-81-322-1916-3_9
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1915-6
Online ISBN: 978-81-322-1916-3
eBook Packages: Engineering, Engineering (R0)