
Map Reduce Autoscaling over the Cloud with Process Mining Monitoring

  • Federico Chesani
  • Anna Ciampolini
  • Daniela Loreti (corresponding author)
  • Paola Mello
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 740)

Abstract

In recent years, the long-standing need for fast and reliable processing solutions has been exacerbated by the growth in data volumes produced by mobile devices, sensors, and almost ubiquitous internet availability. These big data must be analyzed to extract further knowledge.

Distributed programming models, such as Map Reduce, provide a technical answer to this challenge. Furthermore, when relying on cloud infrastructures, Map Reduce platforms can easily be provided with additional computing nodes at runtime (e.g., the system administrator can scale the infrastructure to meet temporal deadlines). Nevertheless, the execution of distributed programming models on the cloud still lacks automated mechanisms to guarantee the Quality of Service (i.e., autonomous scale-up/-down behavior).

In this paper, we focus on two steps: monitoring Map Reduce applications (to detect situations where the temporal deadline will be exceeded) and performing recovery actions on the cluster (by automatically providing additional resources to boost the computation). To this end, we exploit techniques and tools developed in the research field of Business Process Management: in particular, we focus on declarative languages and tools for monitoring the execution of business processes. We introduce a distributed architecture in which a logic-based monitor detects possible delays and triggers recovery actions, such as the dynamic provisioning of a suitable number of resources.
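The monitor-then-recover loop described above can be illustrated with a minimal sketch. This is not the logic-based monitor of the paper: it swaps in a plain linear extrapolation of task progress, and it assumes that throughput scales linearly with the number of nodes. The names `JobProgress`, `estimated_total_time`, and `extra_nodes_needed` are hypothetical and introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class JobProgress:
    """Snapshot of a running Map Reduce job (all fields are hypothetical)."""
    elapsed_s: float   # seconds since the job started
    tasks_done: int    # tasks completed so far
    tasks_total: int   # total number of tasks in the job
    nodes: int         # worker nodes currently in the cluster


def estimated_total_time(p: JobProgress) -> float:
    """Linearly extrapolate the total run time from the observed task rate."""
    if p.tasks_done == 0:
        return math.inf  # no evidence yet, so the estimate is unknown
    return p.elapsed_s * p.tasks_total / p.tasks_done


def extra_nodes_needed(p: JobProgress, deadline_s: float) -> int:
    """Return how many nodes to add so the remaining work fits the deadline.

    Assumption: each node contributes the same task rate, so doubling the
    nodes doubles the throughput. Real clusters only approximate this.
    """
    remaining_time = deadline_s - p.elapsed_s
    if p.tasks_done == 0 or remaining_time <= 0:
        return 0  # nothing to extrapolate from, or scaling cannot help
    if estimated_total_time(p) <= deadline_s:
        return 0  # the job is on track, no recovery action needed
    remaining_tasks = p.tasks_total - p.tasks_done
    # Nodes required = remaining tasks / (per-node rate * remaining time),
    # rearranged to a single division.
    required = math.ceil(
        remaining_tasks * p.elapsed_s * p.nodes
        / (p.tasks_done * remaining_time)
    )
    return max(0, required - p.nodes)


if __name__ == "__main__":
    late = JobProgress(elapsed_s=600.0, tasks_done=100, tasks_total=400, nodes=4)
    print(extra_nodes_needed(late, deadline_s=1800.0))
```

In this example the job has finished a quarter of its tasks in 600 seconds on 4 nodes, so the extrapolated run time (2400 seconds) exceeds the 1800-second deadline and the sketch asks for additional nodes. A declarative, logic-based monitor such as the one the paper proposes would replace the extrapolation step with constraints checked over the stream of execution events.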

Keywords

Business Process Management; Map Reduce; Cloud computing; Autonomic system

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Federico Chesani (1)
  • Anna Ciampolini (1)
  • Daniela Loreti (1)
  • Paola Mello (1)
  1. DISI - Department of Computer Science and Engineering, Università di Bologna, Bologna, Italy
