Abstract
Workflow orchestration is a method for coordinating enterprise functions across applications, data, and infrastructure. With orchestration, applications and their underlying infrastructure can be scaled up or down dynamically. Integration, in contrast, enables the development of new applications that can connect to any other application through specified interfaces. This chapter first explains the opportunities and challenges in workflow orchestration and integration. It then describes in detail BioCloud, an architecture that demonstrates task-based workflow orchestration using two bioinformatics workflows.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Kousalya, G., Balakrishnan, P., Pethuru Raj, C. (2017). Workflow Integration and Orchestration, Opportunities and the Challenges. In: Automated Workflow Scheduling in Self-Adaptive Clouds. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-56982-6_8
DOI: https://doi.org/10.1007/978-3-319-56982-6_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56981-9
Online ISBN: 978-3-319-56982-6
eBook Packages: Computer Science, Computer Science (R0)