Map Reduce Autoscaling over the Cloud with Process Mining Monitoring

  • Conference paper
Part of the book series: Communications in Computer and Information Science (CCIS, volume 740)

Abstract

In recent years, the long-standing need for fast and reliable processing solutions has been further exacerbated by growing data volumes, produced by mobile devices, sensors and almost ubiquitous internet availability. These big data must be analyzed to extract further knowledge.

Distributed programming models, such as Map Reduce, provide a technical answer to this challenge. Furthermore, when relying on cloud infrastructures, Map Reduce platforms can easily be provided with additional computing nodes at runtime (e.g., the system administrator can scale the infrastructure to meet temporal deadlines). Nevertheless, the execution of distributed programming models on the cloud still lacks automated mechanisms to guarantee the Quality of Service (i.e., autonomous scale-up/-down behavior).

In this paper, we focus on two steps: monitoring Map Reduce applications (to detect situations where the temporal deadline will be exceeded) and performing recovery actions on the cluster (by automatically providing additional resources to boost the computation). To this end, we exploit techniques and tools developed in the research field of Business Process Management: in particular, we focus on declarative languages and tools for monitoring the execution of business processes. We introduce a distributed architecture in which a logic-based monitor detects possible delays and triggers recovery actions, such as the dynamic provisioning of an adequate number of resources.
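The monitor-then-scale loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the paper relies on a logic-based monitor built with declarative Business Process Management tools, whereas this sketch swaps in a plain linear extrapolation of task progress. The names `JobStatus`, `estimated_remaining_s` and `extra_nodes_needed` are hypothetical, and the assumption that throughput grows linearly with the number of nodes is an idealisation.

```python
import math
from dataclasses import dataclass


@dataclass
class JobStatus:
    """Hypothetical snapshot of a running Map Reduce job."""
    elapsed_s: float      # seconds since the job started
    completed_tasks: int  # tasks finished so far
    total_tasks: int      # tasks in the whole job
    nodes: int            # worker nodes currently allocated


def estimated_remaining_s(status: JobStatus) -> float:
    """Extrapolate remaining time, assuming the observed task rate holds."""
    if status.completed_tasks == 0 or status.elapsed_s <= 0:
        return math.inf  # no progress yet, nothing to extrapolate from
    pending = status.total_tasks - status.completed_tasks
    return pending * status.elapsed_s / status.completed_tasks


def extra_nodes_needed(status: JobStatus, deadline_s: float) -> int:
    """Nodes to add so the job is projected to finish by deadline_s.

    Assumes throughput scales linearly with node count (an idealisation).
    Returns 0 when the job is on track or no estimate is available yet.
    """
    remaining = estimated_remaining_s(status)
    time_left = deadline_s - status.elapsed_s
    if math.isinf(remaining) or remaining <= time_left:
        return 0
    if time_left <= 0:
        raise ValueError("deadline already passed; scaling cannot recover it")
    required_nodes = math.ceil(status.nodes * remaining / time_left)
    return required_nodes - status.nodes


if __name__ == "__main__":
    # 20 of 100 tasks done after 100 s on 4 nodes, deadline at 300 s.
    status = JobStatus(elapsed_s=100, completed_tasks=20, total_tasks=100, nodes=4)
    print(extra_nodes_needed(status, deadline_s=300))
```

In a real deployment the returned count would be passed to the cloud provider's provisioning interface; that call is deliberately left out here because it depends on the infrastructure in use.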

Notes

  1. Available for download at https://www.inf.unibz.it/~montali/tools.html#MOBUCON.

Author information

Corresponding author

Correspondence to Daniela Loreti.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Chesani, F., Ciampolini, A., Loreti, D., Mello, P. (2017). Map Reduce Autoscaling over the Cloud with Process Mining Monitoring. In: Helfert, M., Ferguson, D., Méndez Muñoz, V., Cardoso, J. (eds) Cloud Computing and Services Science. CLOSER 2016. Communications in Computer and Information Science, vol 740. Springer, Cham. https://doi.org/10.1007/978-3-319-62594-2_6

  • DOI: https://doi.org/10.1007/978-3-319-62594-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62593-5

  • Online ISBN: 978-3-319-62594-2

  • eBook Packages: Computer Science, Computer Science (R0)