A Comprehensive Survey on Data-Intensive Computing and MapReduce Paradigm in Cloud Computing Environments

Iyer, Girish Neelakanta; Silas, Salaja

doi:10.1007/978-81-322-1916-3_9

Girish Neelakanta Iyer & Salaja Silas

Abstract

With the emergence of cloud computing and other web technologies, data storage has become easy and cheap. As a result, petabytes or even zettabytes of data are placed in cloud storage every day through portals such as YouTube and Facebook. This creates a need for big data analysis and data-intensive computing. In this paper, we present a comprehensive survey of data-intensive computing techniques and MapReduce paradigm mechanisms in cloud computing. We first present data-intensive computing methodologies and then describe the MapReduce algorithm for data-intensive computing in big data analysis.

Acknowledgment

The authors would like to acknowledge Dr. Ganesh Neelakanta Iyer for his valuable comments in completing this research work.

Author information

Authors and Affiliations

Department of Information Technology, Karunya University, Coimbatore, TN, India
Girish Neelakanta Iyer & Salaja Silas

Corresponding author

Correspondence to Girish Neelakanta Iyer.

Editor information

Editors and Affiliations

School of Computer Science and Technology, Karunya University, Coimbatore, Tamil Nadu, India
Elijah Blessing Rajsingh
School of Computing, National University of Singapore, Singapore, Singapore
Anand Bhojan
Department of Information Technology, Karunya University, Coimbatore, Tamil Nadu, India
J. Dinesh Peter

About this paper

Cite this paper

Iyer, G.N., Silas, S. (2015). A Comprehensive Survey on Data-Intensive Computing and MapReduce Paradigm in Cloud Computing Environments. In: Rajsingh, E., Bhojan, A., Peter, J. (eds) Informatics and Communication Technologies for Societal Development. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1916-3_9

DOI: https://doi.org/10.1007/978-81-322-1916-3_9
Published: 04 July 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1915-6
Online ISBN: 978-81-322-1916-3
eBook Packages: Engineering, Engineering (R0)
