Accurate Restoration of DNA Sequences

  • Conference paper

Part of the book series: Lecture Notes in Statistics (LNS, volume 105)

Copyright information

© 1995 Springer-Verlag New York, Inc.

About this paper

Cite this paper

Churchill, G.A. (1995). Accurate Restoration of DNA Sequences. In: Gatsonis, C., Hodges, J.S., Kass, R.E., Singpurwalla, N.D. (eds) Case Studies in Bayesian Statistics, Volume II. Lecture Notes in Statistics, vol 105. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2546-1_3

  • DOI: https://doi.org/10.1007/978-1-4612-2546-1_3

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-94566-8

  • Online ISBN: 978-1-4612-2546-1

  • eBook Packages: Springer Book Archive
