Scan Statistics in DNA and Protein Sequence Analysis

  • Joseph Glaz
  • Joseph Naus
  • Sylvan Wallenstein
Part of the Springer Series in Statistics book series (SSS)

Abstract

Scientists in fields ranging from evolution to medicine compare protein or DNA sequences from several biological sources. DNA (deoxyribonucleic acid) is a long molecule that contains the genetic codes that control biological processes. The DNA molecule most often consists of two strands of nucleotides, with each nucleotide consisting of a deoxyribose residue, a phosphate group, and a nucleotide base. The four nucleotide bases (or bases for short) are denoted A, C, G, and T, corresponding to adenine, cytosine, guanine, and thymine. Within a single strand, the deoxyribose residues linked by phosphate bonds form a backbone, like the string of a long necklace with the bases as attached beads. The two strands are linked by hydrogen bonds between pairs of bases: an A on one strand pairs with a T on the other strand, and a C on one strand pairs with a G on the other. Knowing the sequence of bases in one strand therefore determines the sequence in the complementary strand.
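The pairing rule described in the abstract (A with T, C with G) can be illustrated with a minimal Python sketch. The function names `complement` and `reverse_complement` are hypothetical and chosen here for illustration; they do not come from the chapter. The reverse complement relies on the standard convention, not stated in the abstract, that the two strands are read in opposite directions.

```python
# Base pairing as described in the abstract: A <-> T and C <-> G.
# The names below are illustrative assumptions, not the authors' code.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    strand = strand.upper()
    unknown = set(strand) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return strand.translate(PAIRING)


def reverse_complement(strand: str) -> str:
    """Return the complementary strand read in the opposite direction.

    Assumption: the usual convention that paired strands are antiparallel.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("GATTACA"))          # CTAATGT
    print(reverse_complement("GATTACA"))  # TGTAATC
```

Applying `complement` twice returns the original strand, which is the sense in which one strand determines the other.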

Keywords

Perfect Match; Common Word; Independent Sequence; Protein Sequence Analysis; Poisson Approximation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 2001

Authors and Affiliations

  • Joseph Glaz (1)
  • Joseph Naus (2)
  • Sylvan Wallenstein (3)
  1. Department of Statistics, The College of Liberal Arts and Sciences, University of Connecticut, Storrs, USA
  2. Department of Statistics, Rutgers, The State University of New Jersey, Piscataway, USA
  3. Department of Biomathematical Sciences, Mount Sinai School of Medicine, New York, USA
