Abstract
This chapter discusses how to detect evolutionary homology among nucleotide and amino acid sequences and how to analyze the resulting homologous sequences. Topics covered are homology search, pairwise alignment, multiple alignment, and genome-wide sequence viewing.
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Saitou, N. (2013). Sequence Homology Handling. In: Introduction to Evolutionary Genomics. Computational Biology, vol 17. Springer, London. https://doi.org/10.1007/978-1-4471-5304-7_14
DOI: https://doi.org/10.1007/978-1-4471-5304-7_14
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5303-0
Online ISBN: 978-1-4471-5304-7
eBook Packages: Computer Science (R0)