Galaxy: A Gateway to Tools in e-Science

  • Chapter

Part of the book series: Computer Communications and Networks (CCN)

Abstract

e-Science focuses on the use of computational tools and resources to analyze large scientific datasets. Performing these analyses often requires running a variety of computational tools specific to a given scientific domain. This places a significant burden on individual researchers, for whom simply running these tools may be prohibitively difficult, let alone combining tools into a complete analysis or acquiring data and appropriate computational resources. This limits the productivity of individual researchers and represents a significant barrier to potential scientific discovery. To relieve researchers of such unnecessary complexities and promote more robust science, we have developed a tool integration framework called Galaxy. Galaxy abstracts individual tools behind a consistent and easy-to-use web interface to enable advanced data analysis that requires no informatics expertise. Furthermore, Galaxy facilitates easy addition of developed tools, thus supporting tool developers, as well as transparent and reproducible communication of computationally intensive analyses. Recently, we have enabled trivial deployment of a complete Galaxy solution on aggregated infrastructures, including cloud computing providers.

Notes

  1. http://galaxyproject.org

  2. http://usegalaxy.org/

  3. http://usegalaxy.org/u/aun1/p/windshield-splatter

  4. http://mercurial.selenic.com/

  5. http://bitbucket.org/

  6. http://www.sqlalchemy.org/

  7. http://main.g2.bx.psu.edu/u/fischerlab/h/sm1186088

  8. http://www.rabbitmq.com/

  9. http://usegalaxy.org/cloud

  10. http://galaxy.fml.tuebingen.mpg.de/

Acknowledgments

Galaxy is developed by the Galaxy Team: Enis Afgan, Guruprasad Ananda, Dannon Baker, Dan Blankenberg, Ramkrishna Chakrabarty, Nate Coraor, Jeremy Goecks, Greg Von Kuster, Ross Lazarus, Kanwei Li, Anton Nekrutenko, James Taylor, and Kelly Vincent. We thank our many collaborators who support and maintain data warehouses and browsers accessible through Galaxy. Development of the Galaxy framework is supported by NIH grants HG004909 (A.N. and J.T.), HG005133 (J.T. and A.N.), and HG005542 (J.T. and A.N.), by NSF grant DBI-0850103 (A.N. and J.T.), and by funds from the Huck Institutes for the Life Sciences and the Institute for CyberScience at Penn State. Additional funding is provided, in part, under a grant with the Pennsylvania Department of Health using Tobacco Settlement Funds. The Department specifically disclaims responsibility for any analyses, interpretations, or conclusions.

Author information

Corresponding author

Correspondence to James Taylor.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Afgan, E. et al. (2011). Galaxy: A Gateway to Tools in e-Science. In: Yang, X., Wang, L., Jie, W. (eds) Guide to e-Science. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-439-5_6

  • DOI: https://doi.org/10.1007/978-0-85729-439-5_6

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-438-8

  • Online ISBN: 978-0-85729-439-5

  • eBook Packages: Computer Science, Computer Science (R0)