Reliable MapReduce computing on opportunistic resources

Lin, Heshan; Ma, Xiaosong; Feng, Wu-chun

doi:10.1007/s10586-011-0158-7

Published: 27 February 2011

Cluster Computing, volume 15, pages 145–161 (2012)

Heshan Lin¹,
Xiaosong Ma² &
Wu-chun Feng¹

Abstract

MapReduce offers an easy-to-use programming paradigm for processing large data sets, making it an attractive model for opportunistic compute resources. However, unlike the dedicated resources on which MapReduce has mostly been deployed, opportunistic resources have significantly higher rates of node volatility. As a consequence, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate on such volatile resources.

In this paper, we propose MOON, short for MapReduce On Opportunistic eNvironments, which is designed to offer reliable MapReduce service for opportunistic computing. MOON adopts a hybrid resource architecture by supplementing opportunistic compute resources with a small set of dedicated resources, and it extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms that take advantage of the hybrid resource architecture. Our results on an emulated opportunistic computing system running atop a 60-node cluster demonstrate that MOON can deliver significant performance improvements over Hadoop on volatile compute resources and can even finish jobs that Hadoop is unable to complete.
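
To illustrate why replication alone becomes costly as node volatility rises, the following is a minimal back-of-the-envelope sketch, not the model or algorithm used in MOON. It assumes that each node is independently unavailable with probability `p`, so that a block with `r` replicas is unreachable with probability `p ** r`. The function names and the example values are hypothetical and chosen only for illustration.

```python
def data_unavailability(p: float, replicas: int) -> float:
    """Probability that all replicas of a block are offline at once.

    Assumption (not from the paper): nodes fail independently, each
    being unavailable with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return p ** replicas


def replicas_needed(p: float, target_availability: float) -> int:
    """Smallest replica count r such that 1 - p**r >= target_availability."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be strictly between 0 and 1")
    r = 1
    while 1.0 - data_unavailability(p, r) < target_availability:
        r += 1
    return r


if __name__ == "__main__":
    # Illustrative unavailability rates only; these are not measurements.
    for p in (0.01, 0.4):
        print(p, replicas_needed(p, 0.999))
```

Under this simplified independence assumption, the required replica count grows quickly with `p`, which is one way to read the motivation for placing some data on a small set of dedicated nodes rather than relying only on replication across volatile ones.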

Author information

Authors and Affiliations

Virginia Tech, Blacksburg, USA
Heshan Lin & Wu-chun Feng
Oak Ridge National Laboratory, North Carolina State University, Raleigh, USA
Xiaosong Ma

Corresponding author

Correspondence to Heshan Lin.

About this article

Cite this article

Lin, H., Ma, X. & Feng, Wc. Reliable MapReduce computing on opportunistic resources. Cluster Comput 15, 145–161 (2012). https://doi.org/10.1007/s10586-011-0158-7

Received: 02 November 2010
Accepted: 13 January 2011
Published: 27 February 2011
Issue Date: June 2012
DOI: https://doi.org/10.1007/s10586-011-0158-7
