Combining a High-Throughput Bioinformatics Grid and Bioinformatics Web Services

  • Conference paper
Distributed, High-Performance and Grid Computing in Computational Biology (GCCB 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4360)

Abstract

We have created a high-throughput grid for biological sequence analysis, which is freely accessible via bioinformatics Web services. The system allows the execution of computationally intensive sequence alignment algorithms, such as Smith-Waterman or hidden Markov model searches, with speedups of up to three orders of magnitude over single-CPU installations. Users around the world can now process highly sensitive sequence alignments with a turnaround time similar to that of BLAST tools. The grid combines high-throughput accelerators at two bioinformatics facilities in different geographical locations. The tools include TimeLogic DeCypher boards, a Paracel GeneMatcher2 accelerator, and Paracel BlastMachines. The Sun N1 Grid Engine software performs distributed resource management. Clients communicate with the grid through the existing open BioMOBY Web services infrastructure. We also illustrate bioinformatics grid strategies for distributed load balancing, and report several nontrivial technical solutions that may serve as templates for adaptation by other bioinformatics groups.


Editor information

Werner Dubitzky, Assaf Schuster, Peter M. A. Sloot, Michael Schroeder, Mathilde Romberg

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Wang, C., Gordon, P.M.K., Turinsky, A.L., Burgess, J., Dalton, T., Sensen, C.W. (2007). Combining a High-Throughput Bioinformatics Grid and Bioinformatics Web Services. In: Dubitzky, W., Schuster, A., Sloot, P.M.A., Schroeder, M., Romberg, M. (eds) Distributed, High-Performance and Grid Computing in Computational Biology. GCCB 2007. Lecture Notes in Computer Science, vol 4360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69968-2_1

  • DOI: https://doi.org/10.1007/978-3-540-69968-2_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69841-8

  • Online ISBN: 978-3-540-69968-2

  • eBook Packages: Computer Science, Computer Science (R0)
