Exploring Familial Relationships Using Multiple Sequence Alignment

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 173)

Abstract

Over the past 30 yr, a multitude of calcium-binding proteins have been discovered that employ several unique structural motifs for calcium ion binding. The first prominent family identified binds calcium via a helix-loop-helix structural motif, termed the EF-hand binding motif because it occurs between the E and F helices of carp parvalbumin (1). Today, the EF-hand calcium-binding family is ubiquitous, with members implicated in varied roles such as calcium signaling cell response and calcium storage. More recently, other calcium-binding motifs have been identified, such as those found in annexin repeats (2), C2 domain proteins (3), and EGF domain proteins (4). Table 1 summarizes the characteristic amino acid sequence properties of each of these domains as provided in the PROSITE protein motif recognition database (5).
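
PROSITE describes sequence properties of this kind as short patterns in which `x` stands for any residue, `[...]` lists allowed residues, `{...}` lists forbidden residues, and `(n)` gives a repeat count. As a rough illustration, such a pattern can be translated into a regular expression and scanned against a sequence. The sketch below is simplified and does not cover every PROSITE rule; the function name `prosite_to_regex`, the pattern, and the toy sequence are all assumptions made for illustration and are not taken from Table 1 or from an actual PROSITE entry.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: handles x, [..], {..}, (n), (n,m), and the < and >
    anchors at the pattern ends; it does not cover every PROSITE rule.
    """
    elements = pattern.strip().rstrip(".").split("-")
    translated = []
    for element in elements:
        element = element.replace("{", "[^").replace("}", "]")  # {ILV} -> [^ILV]
        element = element.replace("(", "{").replace(")", "}")    # x(2)  -> x{2}
        element = element.replace("x", ".")                      # x     -> any residue
        element = element.replace("<", "^").replace(">", "$")    # terminus anchors
        translated.append(element)
    return "".join(translated)


# Hypothetical pattern and toy sequence, for illustration only
# (not an actual PROSITE entry or a real protein).
PATTERN = "D-x-[DNS]-{ILVFYW}-x(2)-[DE]."
SEQUENCE = "MKDADGSGELT"

regex = prosite_to_regex(PATTERN)
match = re.search(regex, SEQUENCE)
if match:
    # Report the hit using 1-based residue numbering.
    print(f"{regex} matches {match.group()} at residues {match.start() + 1}-{match.end()}")
```

A regular expression reports only exact matches to the pattern, so divergent family members may be missed; profile- or alignment-based methods are generally better suited to those cases.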

References

  1. Kretsinger, R. H., Nockolds, C. E., Coffee, C. J., and Bradshaw, R. A. (1972) The structure of a calcium-binding protein from carp muscle. Cold Spring Harb. Symp. Quant. Biol. 36, 217–220.

  2. Smith, P. D. and Moss, S. E. (1994) Structural evolution of the annexin supergene family. Trends Genet. 10, 241–246.

  3. Nalefski, E. A. and Falke, J. J. (1996) The C2 domain calcium-binding motif: structural and functional diversity. Protein Sci. 5, 2375–2390.

  4. Handford, P. A., Mayhew, M., Baron, M., Winship, P. R., Campbell, I. D., and Brownlee, G. G. (1991) Key residues involved in calcium-binding motifs in EGF-like domains. Nature 351, 164–167.

  5. Hofmann, K., Bucher, P., Falquet, L., and Bairoch, A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res. 27, 215–219.

  6. Kawasaki, H. and Kretsinger, R. H. (1995) Calcium-binding Proteins 1: EF-hands. Protein Profiles 2, 305–490.

  7. Heizmann, C. W. and Hunziker, W. (1991) Intracellular calcium-binding proteins: more sites than insights. Trends Biochem. Sci. 16, 98–103.

  8. Nakayama, S., Moncrief, N. D., and Kretsinger, R. H. (1992) Evolution of EF-hand calcium-modulated proteins. II. Domains of several subfamilies have diverse evolutionary histories. J. Mol. Evol. 34, 416–448.

  9. Kretsinger, R. H. and Nakayama, S. (1993) Evolution of EF-hand calcium-modulated proteins. IV. Exon shuffling did not determine the domain compositions of EF-hand proteins. J. Mol. Evol. 36, 477–488.

  10. Kawasaki, H., Nakayama, S., and Kretsinger, R. H. (1998) Classification and evolution of EF-hand proteins. Biometals 11, 277–295.

  11. Morgan, R. O. and Fernandez, M. P. (1997) Annexin gene structure and molecular evolutionary genetics. Cell. Mol. Life Sci. 53, 508–515.

  12. Morgan, R. O. and Fernandez, M. P. (1997) Distinct annexin subfamilies in plants and protists diverged prior to animal annexins and from a common ancestor. J. Mol. Evol. 44, 178–188.

  13. Heringa, J., Frishman, D., and Argos, P. (1997) Computational methods relating protein sequence and structure, in Proteins: A Comprehensive Treatise, vol. I (Allen, G., ed.), JAI Press, Greenwich, Connecticut, pp. 165–268.

  14. Heringa, J. (1999) Two strategies for sequence comparison: profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem. 23, 341–364.

  15. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

  16. Bairoch, A. and Apweiler, R. (1999) The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res. 27, 49–54.

  17. Stoesser, G., Tuli, M. A., Lopez, R., and Sterk, P. (1999) The EMBL Nucleotide Sequence Database. Nucleic Acids Res. 27, 18–24.

  18. Benson, D. A., Boguski, M. S., Lipman, D. J., Ostell, J., Ouellette, B. F., Rapp, B. A., and Wheeler, D. L. (1999) GenBank. Nucleic Acids Res. 27, 12–17.

  19. Barker, W. C., Garavelli, J. S., McGarvey, P. B., Marzec, C. R., Orcutt, B. C., Srinivasarao, G. Y., et al. (1999) The PIR-international protein sequence database. Nucleic Acids Res. 27, 39–43.

  20. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.

  21. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

  22. Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

  23. Etzold, T., Ulyanov, A., and Argos, P. (1996) SRS: information retrieval system for molecular biology data banks. Methods Enzymol. 266, 114–128.

  24. Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Finn, R. D., and Sonnhammer, E. L. L. (1999) Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins. Nucleic Acids Res. 27, 260–262.

  25. Heringa, J. and Argos, P. (1993) A method to recognize distant repeats in protein sequences. Proteins 17, 391–411.

  26. Heringa, J., Sommerfeldt, H., Higgins, D., and Argos, P. (1992) OBSTRUCT: a program to obtain largest cliques from a protein sequence set according to structural resolution and sequence similarity. Comput. Appl. Biosci. 8, 599–600.

  27. Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.

  28. Rost, B. and Sander, C. (1993) Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232, 584–599.

  29. Frishman, D. and Argos, P. (1996) Incorporation of long-distance interactions in a secondary structure prediction method. Prot. Eng. 9, 133–142.

  30. Frishman, D. and Argos, P. (1997) Seventy-five percent accuracy in protein secondary structure prediction. Proteins 27, 329–335.

  31. King, R. D. and Sternberg, M. J. E. (1996) Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Protein Sci. 5, 2298.

  32. Salamov, A. A. and Solovyev, V. V. (1995) Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J. Mol. Biol. 247, 11–15.

  33. Zvelebil, M. J., Barton, G. J., Taylor, W. R., and Sternberg, M. J. E. (1987) Prediction of protein secondary structure and active sites using the alignment of homologous sequences. J. Mol. Biol. 195, 957.

  34. Cuff, J. A. and Barton, G. J. (1999) Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins 34, 508–519.

  35. Mehta, P., Heringa, J., and Argos, P. (1995) A simple and fast approach to prediction of protein secondary structure from multiply aligned sequences with accuracy above 70%. Protein Sci. 4, 2517–2525.

  36. Sali, A. and Blundell, T. L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815.

  37. Guex, N. and Peitsch, M. C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modelling. Electrophoresis 18, 2714–2723.

  38. Saitou, N. and Nei, M. (1987) The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.

  39. Sneath, P. H. and Sokal, R. R. (1973) Numerical Taxonomy. Freeman, San Francisco, California.

  40. Needleman, S. B. and Wunsch, C. D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453.

  41. Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.

  42. Baldi, P., Chauvin, Y., Hunkapiller, T., and McClure, M. A. (1994) Hidden Markov models of biological primary sequence information. Proc. Natl. Acad. Sci. USA 91, 1059–1063.

  43. Krogh, A., Mian, I. S., Sjölander, K., and Haussler, D. (1994) Hidden Markov models in computational biology. J. Mol. Biol. 235, 1501–1531.

  44. Notredame, C., Holm, L., and Higgins, D. G. (1998) COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14, 407–422.

  45. Thompson, J. D., Plewniak, F., and Poch, O. (1999) A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27, 2682–2690.

  46. Thompson, J. D., Plewniak, F., and Poch, O. (1999) BAliBASE: a benchmark alignment database for the evaluation of multiple sequence alignment programs. Bioinformatics 15, 87–88.

  47. Gotoh, O. (1996) Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J. Mol. Biol. 264, 823–838.

  48. Morgenstern, B., Dress, A., and Werner, T. (1996) Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc. Natl. Acad. Sci. USA 93, 12,098–12,103.

  49. Eddy, S. R. (1995) Multiple alignment using hidden Markov models, in Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology (Rawlings, C., Clark, D., Altman, R., Hunter, L., Lengauer, T., and Wodak, S., eds.), AAAI Press, pp. 114–120.

  50. Lawrence, C. E., Altschul, S. F., Boguski, M. S., Liu, J. S., Neuwald, A. F., and Wootton, J. C. (1993) Detecting subtle sequence signals: the Gibbs sampling strategy for multiple alignment. Science 262, 208–214.

  51. Dayhoff, M. O., Barker, W. C., and Hunt, L. T. (1983) Establishing homologies in protein sequences. Methods Enzymol. 91, 524–545.

  52. Gonnet, G. H., Cohen, M. A., and Benner, S. A. (1992) Exhaustive matching of the entire protein sequence database. Science 256, 1443–1445.

  53. Taylor, W. R. (1988) A flexible method to align large numbers of biological sequences. J. Mol. Evol. 28, 161–169.

  54. Camin, J. H. and Sokal, R. R. (1965) Computer comparison of new and existing criteria for constructing evolutionary trees from sequence data. J. Mol. Evol. 19, 9–19.

  55. Eck, R. V. and Dayhoff, M. O. (1966) in Atlas of Protein Sequence and Structure. Natl. Biomed. Res. Found., Silver Spring, Maryland.

  56. Fitch, W. M. and Margoliash, E. (1967) Construction of phylogenetic trees. Science 155, 279–284.

  57. Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376.

  58. Adachi, J. and Hasegawa, M. (1996) MOLPHY v. 2.3: programs for molecular phylogenetics based on maximum likelihood. Comp. Sci. Monographs, 28, 1–150. Institute of Statistical Mathematics, Tokyo.

  59. Hogeweg, P. and Hesper, B. (1984) The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J. Mol. Evol. 20, 175–186.

  60. Kimura, M. (1983) The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, England.

  61. Felsenstein, J. (1989) PHYLIP-phylogeny inference package (version 3.2). Cladistics 5, 164–166.

  62. Felsenstein, J. (1990) PHYLIP Manual version 3.3. University Herbarium, University of California, Berkeley, California.

  63. Felsenstein, J. (1996) Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. 266, 418–427.

  64. Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

  65. Murata, M., Richardson, J. S., and Sussman, J. L. (1985) Simultaneous comparison of three protein sequences. Proc. Natl. Acad. Sci. USA 82, 3073–3077.

  66. Carrillo, H. and Lipman, D. J. (1988) The multiple sequence alignment. SIAM J. Appl. Math. 48, 1073–1082.

  67. Lipman, D. J., Altschul, S. F., and Kececioglu, J. D. (1989) A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. USA 86, 4412–4415.

  68. Feng, D. F. and Doolittle, R. F. (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351–360.

  69. Barton, G. J. and Sternberg, M. J. E. (1987) A strategy for the rapid multiple alignment of protein sequences: confidence levels from tertiary structure comparisons. J. Mol. Biol. 198, 327–337.

  70. Genetics Computer Group. Program manual for the GCG Package, Version 8. 575 Science Drive, Madison, Wisconsin.

  71. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.

  72. Notredame, C. and Higgins, D. G. (1996) SAGA: sequence alignment by genetic algorithm. Nucleic Acids Res. 24, 1515–1524.

Copyright information

© 2002 Humana Press Inc.

About this protocol

Cite this protocol

Weljie, A.M., Heringa, J. (2002). Exploring Familial Relationships Using Multiple Sequence Alignment. In: Vogel, H.J. (ed.) Calcium-Binding Protein Protocols: Volume 2: Methods and Techniques. Methods in Molecular Biology™, vol 173. Springer, Totowa, NJ. https://doi.org/10.1385/1-59259-184-1:231

  • DOI: https://doi.org/10.1385/1-59259-184-1:231

  • Publisher Name: Springer, Totowa, NJ

  • Print ISBN: 978-0-89603-689-5

  • Online ISBN: 978-1-59259-184-8

  • eBook Packages: Springer Protocols
