Abstract
This paper discusses the problem of aligning two DNA sequences while taking into account that they code for proteins. Criteria for a good alignment of coding DNA are presented, together with an algorithm that satisfies them. The algorithm is robust against frame-shifts and forgiving towards silent substitutions. The important choice of objective function is examined, and several variants are proposed.
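To make the terms "frame-shift" and "silent substitution" concrete, the following Python sketch scores an alignment of two coding sequences codon by codon. It is not the algorithm of the paper, whose details are not given in the abstract. It is a minimal dynamic program under assumed conditions: the standard genetic code, both sequences starting in frame, and a hypothetical objective function (the constants `MATCH`, `MISMATCH`, `CODON_GAP`, and `FRAMESHIFT` are illustrative values chosen here).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical objective function (illustrative values, not from the paper).
MATCH = 2        # codons translate to the same amino acid
MISMATCH = -1    # codons translate to different amino acids
CODON_GAP = -2   # a whole codon inserted or deleted
FRAMESHIFT = -3  # one or two stray nucleotides that break the frame


def codon_score(c1, c2):
    """Compare codons by translation, so silent substitutions score as matches."""
    a1, a2 = CODON_TABLE.get(c1), CODON_TABLE.get(c2)
    return MATCH if a1 is not None and a1 == a2 else MISMATCH


def align_score(x, y):
    """Best global score for aligning x and y (uppercase DNA, read from position 0)."""
    n, m = len(x), len(y)
    best = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i >= 3 and j >= 3:  # codon aligned against codon
                candidates.append(best[i - 3][j - 3] + codon_score(x[i - 3:i], y[j - 3:j]))
            if i >= 3:             # codon of x against a gap
                candidates.append(best[i - 3][j] + CODON_GAP)
            if j >= 3:             # codon of y against a gap
                candidates.append(best[i][j - 3] + CODON_GAP)
            for k in (1, 2):       # frame-shift: skip 1 or 2 nucleotides
                if i >= k:
                    candidates.append(best[i - k][j] + FRAMESHIFT)
                if j >= k:
                    candidates.append(best[i][j - k] + FRAMESHIFT)
            best[i][j] = max(candidates)
    return best[n][m]


reference = "ATGGCTGAA"   # Met-Ala-Glu
silent = "ATGGCCGAA"      # GCT -> GCC still codes for Ala
shifted = "ATGGCGAA"      # one nucleotide deleted from the second codon

print(align_score(reference, silent))
print(align_score(reference, shifted))
```

Because codons are compared through their translations, the silent substitution costs nothing here, while the single-nucleotide deletion is absorbed by one frame-shift move instead of corrupting every downstream codon. Changing the four constants changes which alignments are preferred, which is why the choice of objective function matters.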
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arvestad, L. (1997). Aligning coding DNA in the presence of frame-shift errors. In: Apostolico, A., Hein, J. (eds) Combinatorial Pattern Matching. CPM 1997. Lecture Notes in Computer Science, vol 1264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63220-4_59
DOI: https://doi.org/10.1007/3-540-63220-4_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63220-7
Online ISBN: 978-3-540-69214-0
eBook Packages: Springer Book Archive