Abstract
Multiple sequence alignment is performed by selecting the alignment with the maximum information content score computed from a weight matrix. However, two or more alignments can share the same highest score, which makes the choice of the best alignment ambiguous. This paper addresses the issue by introducing the concept of a joint weight matrix, which eliminates the randomness in selecting the best alignment of multiple sequences. Alignments with equal scores are iteratively re-scored with joint weight matrices of increasing level (nucleotide pairs, triplets and so on) until a single best alignment is found. The method can be easily incorporated into algorithms that use a weight matrix for scoring, such as those based on the widely used Gibbs sampling method.
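The tie-breaking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes ungapped sites of equal width, a uniform background of 0.25 per nucleotide, joint matrices built from adjacent columns only, and no small-sample correction. The names `information_content` and `break_ties` are hypothetical.

```python
from collections import Counter
from math import isclose, log2


def information_content(sites, level=1):
    """Score an ungapped alignment at a given joint level.

    Assumption: a level-k joint matrix counts k-mers over windows of
    k adjacent columns, scored against a uniform background (0.25 ** k).
    """
    width = len(sites[0])
    n_sites = len(sites)
    background = 0.25 ** level
    total = 0.0
    for start in range(width - level + 1):
        # Frequencies of the k-mers observed in this window of columns.
        counts = Counter(site[start:start + level] for site in sites)
        for count in counts.values():
            freq = count / n_sites
            total += freq * log2(freq / background)
    return total


def break_ties(candidates, max_level=None):
    """Re-score tied alignments at increasing levels until one remains.

    Returns the list of surviving alignments; it holds more than one
    entry only if the tie persists up to max_level.
    """
    width = len(candidates[0][0])
    if max_level is None:
        max_level = width
    remaining = list(candidates)
    for level in range(1, max_level + 1):
        scores = [information_content(a, level) for a in remaining]
        best = max(scores)
        # Keep only the alignments that reach the best score at this level.
        remaining = [
            a for a, s in zip(remaining, scores)
            if isclose(s, best, rel_tol=1e-9, abs_tol=1e-12)
        ]
        if len(remaining) == 1:
            break
    return remaining


# Two candidate alignments with identical single-column composition.
alignment_a = ["AC", "AC", "GT", "GT"]
alignment_b = ["AC", "AT", "GC", "GT"]
winner = break_ties([alignment_a, alignment_b])
```

In this example both candidates score 2 bits at level 1, because each column holds two nucleotides at frequency 0.5. At level 2 the first candidate contains only two distinct pairs and scores 3 bits, while the second contains four distinct pairs and scores 2 bits, so the first is selected.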
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shu, JJ., Yong, K.Y., Chan, W.K. (2011). Lecture Notes in Computer Science: Multiple DNA Sequence Alignment Using Joint Weight Matrix. In: Murgante, B., Gervasi, O., Iglesias, A., Taniar, D., Apduhan, B.O. (eds) Computational Science and Its Applications - ICCSA 2011. ICCSA 2011. Lecture Notes in Computer Science, vol 6784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21931-3_51
DOI: https://doi.org/10.1007/978-3-642-21931-3_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21930-6
Online ISBN: 978-3-642-21931-3
eBook Packages: Computer Science (R0)