Combining a High-Throughput Bioinformatics Grid and Bioinformatics Web Services

Wang, Chunyan; Gordon, Paul M. K.; Turinsky, Andrei L.; Burgess, Jason; Dalton, Terry; Sensen, Christoph W.

doi:10.1007/978-3-540-69968-2_1

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4360)

Included in the following conference series:

International Workshop on Grid Computing in Computational Biology

Abstract

We have created a high-throughput grid for biological sequence analysis, which is freely accessible via bioinformatics Web services. The system allows the execution of computationally intensive sequence alignment algorithms, such as Smith-Waterman or hidden Markov model searches, with speedups of up to three orders of magnitude over single-CPU installations. Users around the world can now run highly sensitive sequence alignments with a turnaround time similar to that of BLAST tools. The grid combines high-throughput accelerators at two bioinformatics facilities in different geographical locations. The accelerators include TimeLogic DeCypher boards, a Paracel GeneMatcher2 accelerator, and Paracel BlastMachines. The Sun N1 Grid Engine software performs distributed resource management. Clients communicate with the grid through the existing open BioMOBY Web services infrastructure. We also illustrate bioinformatics grid strategies for distributed load balancing, and report several nontrivial technical solutions that may serve as templates for adaptation by other bioinformatics groups.
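
To give a sense of why Smith-Waterman searches are computationally intensive, the following is a minimal sketch of the local alignment score computation. It is not the authors' implementation and does not reflect how the accelerators work internally. The function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration only. The nested loops visit every pair of positions, so the cost grows with the product of the two sequence lengths, and a database search repeats this for every database entry.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring only: the match, mismatch, and linear gap
    values are assumptions, not parameters taken from the chapter.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the matrix are kept because the sketch returns the score alone; recovering the alignment itself would require storing the full matrix and tracing back from the best cell.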

Author information

Authors and Affiliations

Sun Center of Excellence for Visual Genomics, University of Calgary, HS 1150, 3330 Hospital Dr. NW, Calgary, Alberta, T2N 4N1, Canada
Chunyan Wang, Paul M. K. Gordon, Andrei L. Turinsky & Christoph W. Sensen
National Research Council Institute for Marine Biosciences, 1411 Oxford Street, Halifax, Nova Scotia, B3H 3Z1, Canada
Jason Burgess & Terry Dalton

Editor information

Werner Dubitzky, Assaf Schuster, Peter M. A. Sloot, Michael Schroeder, Mathilde Romberg

Cite this paper

Wang, C., Gordon, P.M.K., Turinsky, A.L., Burgess, J., Dalton, T., Sensen, C.W. (2007). Combining a High-Throughput Bioinformatics Grid and Bioinformatics Web Services. In: Dubitzky, W., Schuster, A., Sloot, P.M.A., Schroeder, M., Romberg, M. (eds) Distributed, High-Performance and Grid Computing in Computational Biology. GCCB 2007. Lecture Notes in Computer Science, vol 4360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69968-2_1

DOI: https://doi.org/10.1007/978-3-540-69968-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69841-8
Online ISBN: 978-3-540-69968-2
eBook Packages: Computer Science, Computer Science (R0)
