Abstract
Stragglers can delay jobs and seriously reduce cluster efficiency. Much research has been devoted to this problem, including Blacklist[8], speculative execution[1, 6], and Dolly[8]. In this paper, we put forward a new approach for mitigating stragglers in MapReduce, named Hummer. It starts task clones only for tasks at high risk of delay. Experimental results show that it decreases the risk of job delay while consuming fewer resources. For small jobs, Hummer also improves job completion time by 48% and 10% compared to LATE and Dolly, respectively.
References
Ananthanarayanan, G., Kandula, S., Greenberg, A., Stoica, I., Harris, E., Saha, B.: Reining in the Outliers in Map-Reduce Clusters using Mantri. In: Proc. of the USENIX OSDI (2010)
Kwon, Y., Balazinska, M., Howe, B., Rolia, J.: SkewTune: Mitigating skew in MapReduce applications. In: Proc. of the SIGMOD Conf., pp. 25–36 (2012)
Dean, J., Ghemawat, S.: MapReduce: Simplified data processing on large clusters. In: Proc. of the USENIX OSDI (2004)
Zaharia, M., Konwinski, A., Joseph, A.D., Katz, R., Stoica, I.: Improving MapReduce Performance in Heterogeneous Environments. In: Proc. of the USENIX OSDI (2008)
Kwon, Y., Balazinska, M., Howe, B., Rolia, J.: SkewTune in action (demonstration). Proc. of the VLDB Endowment 5(12), 1934–1937 (2012)
Ananthanarayanan, G., Ghodsi, A., Shenker, S., Stoica, I.: Effective Straggler Mitigation: Attack of the Clones. In: Proc. of the USENIX NSDI (2013)
Ananthanarayanan, G., Hung, M.C.-C., Ren, X., Stoica, I., Wierman, A., Yu, M.: GRASS: Trimming Stragglers in Approximation Analytics. In: Proc. of the 11th USENIX NSDI (2014)
Kwon, Y., Balazinska, M., Howe, B., Rolia, J.: A Study of Skew in MapReduce Applications. In: Proc. of the Open Cirrus Summit (2011)
Chen, Y., Alspaugh, S., Borthakur, D., Katz, R.: Energy Efficiency for Large-Scale MapReduce Workloads with Significant Interactive Analysis. In: Proc. of the ACM EuroSys (2012)
Barroso, L.A.: Warehouse-scale computing: Entering the teenage decade. In: Proc. of the ISCA (2011)
Resnick, S.: Heavy-tail phenomena: probabilistic and statistical modeling. Springer (2007)
Cirne, W., Paranhos, D., Brasileiro, F., Goes, L.F.W., Voorsluys, W.: On the Efficacy, Efficiency and Emergent Behavior of Task Replication in Large Distributed Systems. Parallel Computing 33(3), 213–234 (2007)
Hadoop, http://hadoop.apache.org/
Ananthanarayanan, G., Ghodsi, A., Shenker, S., Stoica, I.: Why Let Resources Idle? Aggressive Cloning of Jobs with Dolly. In: Proc. of the HotCloud (2012)
Ousterhout, K., Wendell, P., Zaharia, M., Stoica, I.: Sparrow: Distributed, Low-Latency Scheduling. In: Proc. of the SOSP (2013)
Ghodsi, A., Zaharia, M., Shenker, S., Stoica, I.: Choosy: Max-Min Fair Sharing for Datacenter Jobs with Constraints. In: Proc. of the EuroSys (2013)
Zaharia, M., Borthakur, D., Sarma, J.S., Elmeleegy, K., Shenker, S., Stoica, I.: Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. In: EuroSys 2010: Proceedings of the 5th European Conference on Computer Systems, pp. 265–278. ACM, New York (2010)
Gittins, J.C.: Bandit Processes and Dynamic Allocation Indices. Journal of the Royal Statistical Society. Series B (Methodological) (1979)
Sonin, I.: A Generalized Gittins Index for a Markov Chain and Its Recursive Calculation. Statistics & Probability Letters (2008)
Dean, J.: Achieving Rapid Response Times in Large Online Services, http://research.google.com/People/jeff/latency.html
Ren, K., Kwon, Y., Balazinska, M., Howe, B.: Hadoop’s Adolescence: An Analysis of Hadoop Usage in Scientific Workloads. In: Proc. of the VLDB (2013)
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, J., Wang, C., Li, D., Huang, Z. (2015). Partial Clones for Stragglers in MapReduce. In: Wang, H., et al. Intelligent Computation in Big Data Era. ICYCSEE 2015. Communications in Computer and Information Science, vol 503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46248-5_14
DOI: https://doi.org/10.1007/978-3-662-46248-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46247-8
Online ISBN: 978-3-662-46248-5
eBook Packages: Computer Science, Computer Science (R0)