Abstract
Big data and large-scale data processing have become an important and rapidly developing area. MapReduce is an enabling technology of cloud computing, and Hadoop, the target platform of this paper, is one of its most popular implementations. When running a MapReduce job, however, programmers have little information about how to fine-tune the application's parameters, and finding the most suitable values takes considerable time. This paper analyzes the execution process of MapReduce and models it as SPN-MR, a Stochastic Petri Net. To analyze performance with SPN-MR, formulas for the mean delay time of each timed transition are defined. SPN-MR simulates the elapsed time of a MapReduce job with a known input data size, which reduces the time cost of performance tuning. SPN-MR was evaluated on several real benchmark tests, and the results show an average error rate within 5 percent. It can therefore provide effective performance evaluation reports for MapReduce programmers.
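To make the idea of timed transitions with a mean delay concrete, the following is a minimal sketch of a stochastic-Petri-net-style simulation in Python. It is not the SPN-MR model itself: the net structure (a pool of pending map tasks, a pool of free map slots, and a single reduce stage), the exponential firing delays, and all names and parameter values (`simulate_job`, `mean_elapsed`, `mean_map_delay`, `mean_reduce_delay`) are assumptions chosen for illustration, and the paper's own delay formulas are not reproduced here.

```python
import heapq
import random


def simulate_job(num_map_tasks, map_slots, mean_map_delay, mean_reduce_delay, rng):
    """Return the simulated elapsed time of one job (hypothetical toy net).

    Places: `pending` (map tasks waiting), `free` (idle map slots),
    `done` (finished map tasks). The timed "map" transition fires with an
    exponentially distributed delay; the timed "reduce" transition is
    enabled only after every map token has reached `done`.
    """
    if map_slots < 1:
        raise ValueError("map_slots must be at least 1")

    pending, free, done = num_map_tasks, map_slots, 0
    clock = 0.0
    running = []  # min-heap of completion times for map firings in progress

    while done < num_map_tasks:
        # Start a map firing whenever a pending task and a free slot coexist.
        while pending > 0 and free > 0:
            pending -= 1
            free -= 1
            delay = rng.expovariate(1.0 / mean_map_delay)
            heapq.heappush(running, clock + delay)
        # Advance the clock to the earliest completion and release its slot.
        clock = heapq.heappop(running)
        free += 1
        done += 1

    # Single reduce stage, also with an exponentially distributed delay.
    clock += rng.expovariate(1.0 / mean_reduce_delay)
    return clock


def mean_elapsed(runs, seed, **job):
    """Average the simulated elapsed time over `runs` independent runs."""
    rng = random.Random(seed)
    return sum(simulate_job(rng=rng, **job) for _ in range(runs)) / runs


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    job = dict(num_map_tasks=8, mean_map_delay=10.0, mean_reduce_delay=5.0)
    for slots in (1, 2, 4):
        print(slots, round(mean_elapsed(2000, seed=1, map_slots=slots, **job), 2))
```

In a sketch like this, the mean delays would be the quantities derived from the input data size and cluster characteristics; comparing the averaged elapsed time across parameter settings (here, the number of map slots) is the kind of tuning question the abstract describes.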
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheng, ST., Wang, HC., Chen, YJ., Chen, CF. (2015). Performance Analysis Using Petri Net Based MapReduce Model in Heterogeneous Clusters. In: Chiu, D., et al. Advances in Web-Based Learning – ICWL 2013 Workshops. ICWL 2013. Lecture Notes in Computer Science, vol 8390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46315-4_18
DOI: https://doi.org/10.1007/978-3-662-46315-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46314-7
Online ISBN: 978-3-662-46315-4
eBook Packages: Computer Science, Computer Science (R0)