Large-Scale Sequencing and Analytical Processing of ESTs

  • Makedonka Mitreva
  • Elaine R. Mardis
Part of the Methods in Molecular Biology book series (MIMB, volume 533)


Expressed sequence tags (ESTs) have proven to be one of the most rapid and cost-effective routes to gene discovery for eukaryotic genomes. Their many other uses, such as probe design for microarrays, detection of alternative splicing, verification of open reading frames, and confirmation of exon/intron and gene boundaries, further justify their inclusion in many genomic characterization projects. Hence, the number of ESTs deposited into the dbEST division of GenBank has increased steadily. This trend also correlates with ever-improving molecular techniques for obtaining biological material, performing RNA extraction, and constructing cDNA libraries, and above all with evolving sequencing chemistry and instrumentation and decreasing sequencing costs. This chapter describes large-scale sequencing of ESTs on two distinct platforms, the ABI 3730xl and the 454 Life Sciences GS20 sequencers, and the corresponding processes of sequence extraction, processing, and submission to public databases. The conventional 3730xl sequencing process is described starting with the plating of an already-existing cDNA library, whereas the section on 454 GS20 pyrosequencing also includes a method for generating full-length cDNA sequences. With appropriate bioinformatics tools, these platforms, used independently or together, provide a powerful approach for comprehensive exploration of an organism’s transcriptome.

Key words

Expressed sequence tags; EST sequencing; cDNA; capillary sequencing; ABI 3730xl; pyrosequencing; GS20



The authors would like to thank the Genome Sequencing Center’s Sequence Production and Technology Development Groups. The authors are supported by grants from the National Human Genome Research Institute (U54 HG003079) and the National Institute of Allergy and Infectious Diseases (AI 46593).



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Makedonka Mitreva (1)
  • Elaine R. Mardis (1)
  1. Genome Sequencing Center, Department of Genetics, Washington University School of Medicine, St. Louis, USA
