Sequence Homology Handling

Introduction to Evolutionary Genomics

Part of the book series: Computational Biology (COBO, volume 17)

Abstract

This chapter discusses how to detect evolutionary homology among nucleotide and amino acid sequences and how to analyze the resulting homologous sequences. Topics include homology search, pairwise alignment, multiple alignment, and genome-wide sequence viewing.

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Saitou, N. (2013). Sequence Homology Handling. In: Introduction to Evolutionary Genomics. Computational Biology, vol 17. Springer, London. https://doi.org/10.1007/978-1-4471-5304-7_14

  • DOI: https://doi.org/10.1007/978-1-4471-5304-7_14

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5303-0

  • Online ISBN: 978-1-4471-5304-7

  • eBook Packages: Computer Science, Computer Science (R0)
