DNA Sequence Databases

Edwards, David; Hansen, David; Stajich, Jason E.

doi:10.1007/978-0-387-92738-1_1



Abstract

The ability to sequence the DNA of an organism has become one of the most important tools in modern biological research. Beginning as a manual process, in which DNA was sequenced a few tens or hundreds of nucleotides at a time, DNA sequencing is now performed by high-throughput sequencing machines, with billions of bases of DNA being sequenced daily around the world. The recent development of “next generation” sequencing technology increases the throughput of sequence production manyfold and reduces costs by orders of magnitude. This will eventually enable the sequencing of the whole genome of an individual for under 1,000 dollars. However, mechanisms for sharing, analysing, and efficiently storing this data will become more critical as the amount of data being collected grows. Most importantly for biologists around the world, the analysis of this data will depend on the quality of the sequence data and annotations that are maintained in the public databases.

In this chapter, we give an overview of sequencing technology as it has changed over time, including some of the new technologies that will enable the sequencing of personal genomes. We then discuss the public DNA databases that collect, check, and publish DNA sequences from around the world. Finally, we describe how to access this data.



Author information

Authors and Affiliations

Australian Centre for Plant Functional Genomics, Institute for Molecular Biosciences and School of Land, Crop and Food Sciences, University of Queensland, Brisbane, QLD, 4072, Australia
David Edwards


Corresponding author

Correspondence to David Edwards.

Editor information

Editors and Affiliations

Inst. Molecular Bioscience, University of Queensland, St. Lucia, 4072, Australia
David Edwards
Dept. Plant & Microbial Biology, University of California, Berkeley, Koshland Hall 111, Berkeley, 94720, U.S.A.
Jason Stajich
e-Health Research Centre, Adelaide St. 300, Brisbane, 4000, Australia
David Hansen


About this chapter

Cite this chapter

Edwards, D., Hansen, D., Stajich, J.E. (2009). DNA Sequence Databases. In: Edwards, D., Stajich, J., Hansen, D. (eds) Bioinformatics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-92738-1_1


DOI: https://doi.org/10.1007/978-0-387-92738-1_1
Published: 05 August 2009
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-92737-4
Online ISBN: 978-0-387-92738-1
eBook Packages: Biomedical and Life Sciences (R0)
