Abstract
Efficient evaluation of distributed computations on large-scale data is central to modern scientific computing, especially in big data analysis, image processing, and data mining applications. The problem is particularly challenging in distributed environments such as campus clusters, grids, or clouds, in which the basic computation routines are offered as web/cloud services. In this paper, we propose a locality-aware, workflow-based solution for evaluating large-scale matrix expressions in a distributed environment. Our solution is based on the automatic generation of BPEL workflows that coordinate long-running, asynchronous, and parallel invocations of services. We optimize the input expression to maximize parallel execution of independent operations while minimizing the matrix transfer cost. Our approach frees the end user from the burden of writing and debugging lengthy BPEL workflows. We evaluated our solution on realistic mathematical expressions executed on large-scale matrices distributed across multiple clouds.
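To make the two optimization goals stated in the abstract concrete (parallel execution of independent operations and low matrix transfer cost), the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes a hypothetical expression tree (`Node`), hypothetical site names (`cloud1`, `cloud2`), matrix sizes in arbitrary units, a greedy rule that runs each operation where its larger operand already resides, and the simplification that a result is as large as its largest operand. Operations that end up on the same level have no mutual dependencies and could be invoked in parallel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A node of a matrix expression tree (hypothetical representation)."""
    name: str                       # matrix name for leaves, operator for inner nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    site: Optional[str] = None      # where the matrix is stored or computed
    size: int = 0                   # matrix size in arbitrary units


def plan(node: Node, levels: Dict[int, List[str]]) -> Tuple[int, int]:
    """Assign a site to every operation; return (level, total transfer cost)."""
    if node.left is None and node.right is None:
        return 0, 0                 # a stored matrix: nothing to compute or move
    l_level, l_cost = plan(node.left, levels)
    r_level, r_cost = plan(node.right, levels)
    # Greedy locality rule (an assumption): compute where the larger operand
    # already is, so that only the smaller operand has to be shipped.
    if node.left.size >= node.right.size:
        big, small = node.left, node.right
    else:
        big, small = node.right, node.left
    node.site = big.site
    moved = 0 if small.site == big.site else small.size
    # Simplifying assumption: the result is as large as the largest operand.
    node.size = big.size
    # Operations on the same level are independent of each other.
    level = max(l_level, r_level) + 1
    levels.setdefault(level, []).append(f"{node.name}@{node.site}")
    return level, l_cost + r_cost + moved


# Example expression: (A * B) + (C * D)
A = Node("A", site="cloud1", size=100)
B = Node("B", site="cloud2", size=40)
C = Node("C", site="cloud2", size=80)
D = Node("D", site="cloud2", size=80)
expr = Node("+", Node("*", A, B), Node("*", C, D))

levels: Dict[int, List[str]] = {}
depth, total = plan(expr, levels)
for lvl in sorted(levels):
    print(f"level {lvl}: {levels[lvl]}")
print("total transfer cost:", total)
```

In this sketch both multiplications fall on level 1 and could run concurrently on different sites, while the addition waits on level 2. A workflow generator could map each level to a group of parallel service invocations; how the paper actually performs this mapping is not described in the abstract.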
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sabry, F., Erradi, A., Nassar, M., Malluhi, Q.M. (2014). Automatic Generation of Optimized Workflow for Distributed Computations on Large-Scale Matrices. In: Franch, X., Ghose, A.K., Lewis, G.A., Bhiri, S. (eds) Service-Oriented Computing. ICSOC 2014. Lecture Notes in Computer Science, vol 8831. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45391-9_6
DOI: https://doi.org/10.1007/978-3-662-45391-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45390-2
Online ISBN: 978-3-662-45391-9
eBook Packages: Computer Science (R0)