Asking Complex Questions of the Genome Without Programming

  • Protocol
Genetic Variation

Part of the book series: Methods in Molecular Biology (MIMB, volume 628)

Abstract

Vast and growing amounts of genomic and genetic data are now available. Although much of this data can be reached with relatively simple web queries, some questions require more complex ones. This chapter reviews the hierarchy of tools for querying genetic and genomic data. For querying multiple genes, variants, or regions, Ensembl BioMart and the UCSC Table Browser offer flexible interfaces. For more complex queries, Galaxy is a sophisticated tool for building workflows over existing internet resources. For the most challenging genome-scale queries, programmatic access through a defined application programming interface (API), such as the one provided by Ensembl, may be required. All of these tools make it possible to rapidly ask many questions that were difficult to answer a few years ago, but choosing the appropriate tool for the job is critical.

Author information

Corresponding author

Correspondence to Peter M. Woollard.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Woollard, P.M. (2010). Asking Complex Questions of the Genome Without Programming. In: Barnes, M., Breen, G. (eds) Genetic Variation. Methods in Molecular Biology, vol 628. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-367-1_3

  • DOI: https://doi.org/10.1007/978-1-60327-367-1_3

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60327-366-4

  • Online ISBN: 978-1-60327-367-1

  • eBook Packages: Springer Protocols
