Galaxy: A Gateway to Tools in e-Science

Afgan, Enis; Goecks, Jeremy; Baker, Dannon; Coraor, Nate; Nekrutenko, Anton; Taylor, James

doi:10.1007/978-0-85729-439-5_6


Chapter
First Online: 01 January 2011


Part of the book series: Computer Communications and Networks (CCN)

Abstract

e-Science focuses on the use of computational tools and resources to analyze large scientific datasets. Performing these analyses often requires running a variety of computational tools specific to a given scientific domain. This places a significant burden on individual researchers, for whom simply running these tools may be prohibitively difficult, let alone combining tools into a complete analysis or acquiring data and appropriate computational resources. This limits the productivity of individual researchers and represents a significant barrier to potential scientific discovery. To relieve researchers of such unnecessary complexities and promote more robust science, we have developed a tool integration framework called Galaxy. Galaxy abstracts individual tools behind a consistent and easy-to-use web interface to enable advanced data analysis that requires no informatics expertise. Furthermore, Galaxy facilitates both the easy addition of developed tools, thus supporting tool developers, and the transparent and reproducible communication of computationally intensive analyses. Recently, we have enabled trivial deployment of a complete Galaxy solution on aggregated infrastructures, including cloud computing providers.



Acknowledgments

Galaxy is developed by the Galaxy Team: Enis Afgan, Guruprasad Ananda, Dannon Baker, Dan Blankenberg, Ramkrishna Chakrabarty, Nate Coraor, Jeremy Goecks, Greg Von Kuster, Ross Lazarus, Kanwei Li, Anton Nekrutenko, James Taylor, and Kelly Vincent. We thank our many collaborators who support and maintain data warehouses and browsers accessible through Galaxy. Development of the Galaxy framework is supported by NIH grants HG004909 (A.N. and J.T.), HG005133 (J.T. and A.N.), and HG005542 (J.T. and A.N.), by NSF grant DBI-0850103 (A.N. and J.T.), and by funds from the Huck Institutes of the Life Sciences and the Institute for CyberScience at Penn State. Additional funding is provided, in part, under a grant with the Pennsylvania Department of Health using Tobacco Settlement Funds. The Department specifically disclaims responsibility for any analyses, interpretations, or conclusions.

Author information

Authors and Affiliations

Department of Biology and Department of Mathematics & Computer Science, Emory University, Druid Hills, GA, USA
Enis Afgan, Jeremy Goecks, Dannon Baker & James Taylor
Huck Institutes of the Life Sciences and Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
Nate Coraor & Anton Nekrutenko

Consortia

The Galaxy Team

Corresponding author

Correspondence to James Taylor.

Editor information

Editors and Affiliations

Reading e-Science Centre, Harry Pitt Building, University of Reading, Earley Gate, Whiteknights 3, Reading, RG6 6AL, United Kingdom
Xiaoyu Yang
Pervasive Technology Institute, Indiana University, East 10th Street 2719, Bloomington, 47408, Indiana, USA
Lizhe Wang
Fac. Professional Studies, School of Computing, Thames Valley University, St. Mary's Road TC372, Ealing, London, W5 5RF, United Kingdom
Wei Jie

About this chapter

Cite this chapter

Afgan, E. et al. (2011). Galaxy: A Gateway to Tools in e-Science. In: Yang, X., Wang, L., Jie, W. (eds) Guide to e-Science. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-439-5_6

DOI: https://doi.org/10.1007/978-0-85729-439-5_6
Published: 13 April 2011
Publisher Name: Springer, London
Print ISBN: 978-0-85729-438-8
Online ISBN: 978-0-85729-439-5
eBook Packages: Computer Science, Computer Science (R0)
