Exploring Familial Relationships Using Multiple Sequence Alignment

Weljie, Aalim M.; Heringa, Jaap

doi:10.1385/1-59259-184-1:231

Protocol

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 173)

Abstract

Over the course of the past 30 yr, a multitude of calcium-binding proteins have been discovered that employ several unique structural motifs for calcium ion binding. The first prominent family identified bound calcium via a helix-loop-helix structural motif, which was named the EF-hand binding motif because it occurs between the E and F helices of carp parvalbumin (1). Today, the EF-hand calcium-binding family is ubiquitous, with members implicated in varied roles such as calcium signaling, cell response, and calcium storage. More recently, other calcium-binding motifs such as those found in annexin repeats (2), C2 domain proteins (3), and EGF domain proteins (4) have been identified. Table 1 summarizes the characteristic amino acid sequence properties of each of these domains as provided in the PROSITE protein motif recognition database (5).

References

Kretsinger, R. H., Nockolds, C. E., Coffee, C. J., and Bradshaw, R. A. (1972) The structure of a calcium-binding protein from carp muscle. Cold Spring Harb. Symp. Quant. Biol. 36, 217–220.
Smith, P. D. and Moss, S. E. (1994) Structural evolution of the annexin supergene family. Trends Genet. 10, 241–246.
Nalefski, E. A. and Falke, J. J. (1996) The C2 domain calcium-binding motif: structural and functional diversity. Protein Sci. 5, 2375–2390.
Handford, P. A., Mayhew, M., Baron, M., Winship, P. R., Campbell, I. D., and Brownlee, G. G. (1991) Key residues involved in calcium-binding motifs in EGF-like domains. Nature 351, 164–167.
Hofmann, K., Bucher, P., Falquet, L., and Bairoch, A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res. 27, 215–219.
Kawasaki, H. and Kretsinger, R. H. (1995) Calcium-binding proteins 1: EF-hands. Protein Profiles 2, 305–490.
Heizmann, C. W. and Hunziker, W. (1991) Intracellular calcium-binding proteins: more sites than insights. Trends Biochem. Sci. 16, 98–103.
Nakayama, S., Moncrief, N. D., and Kretsinger, R. H. (1992) Evolution of EF-hand calcium-modulated proteins. II. Domains of several subfamilies have diverse evolutionary histories. J. Mol. Evol. 34, 416–448.
Kretsinger, R. H. and Nakayama, S. (1993) Evolution of EF-hand calcium-modulated proteins. IV. Exon shuffling did not determine the domain compositions of EF-hand proteins. J. Mol. Evol. 36, 477–488.
Kawasaki, H., Nakayama, S., and Kretsinger, R. H. (1998) Classification and evolution of EF-hand proteins. Biometals 11, 277–295.
Morgan, R. O. and Fernandez, M. P. (1997) Annexin gene structure and molecular evolutionary genetics. Cell. Mol. Life Sci. 53, 508–515.
Morgan, R. O. and Fernandez, M. P. (1997) Distinct annexin subfamilies in plants and protists diverged prior to animal annexins and from a common ancestor. J. Mol. Evol. 44, 178–188.
Heringa, J., Frishman, D., and Argos, P. (1997) Computational methods relating protein sequence and structure, in Proteins: A Comprehensive Treatise, vol. I (Allen, G., ed.), JAI Press, Greenwich, Connecticut, pp. 165–268.
Heringa, J. (1999) Two strategies for sequence comparison: profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem. 23, 341–364.
Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Bairoch, A. and Apweiler, R. (1999) The SWISS-PROT protein sequence data bank and its supplement TREMBL. Nucleic Acids Res. 27, 49–54.
Stoesser, G., Tuli, M. A., Lopez, R., and Sterk, P. (1999) The EMBL Nucleotide Sequence Database. Nucleic Acids Res. 27, 18–24.
Benson, D. A., Boguski, M. S., Lipman, D. J., Ostell, J., Ouellette, B. F., Rapp, B. A., and Wheeler, D. L. (1999) GenBank. Nucleic Acids Res. 27, 12–17.
Barker, W. C., Garavelli, J. S., McGarvey, P. B., Marzec, C. R., Orcutt, B. C., Srinivasarao, G. Y., et al. (1999) The PIR-international protein sequence database. Nucleic Acids Res. 27, 39–43.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Etzold, T., Ulyanov, A., and Argos, P. (1996) SRS: information retrieval system for molecular biology data banks. Methods Enzymol. 266, 114–128.
Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Finn, R. D., and Sonnhammer, E. L. L. (1999) Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins. Nucleic Acids Res. 27, 260–262.
Heringa, J. and Argos, P. (1993) A method to recognize distant repeats in protein sequences. Proteins 17, 391–411.
Heringa, J., Sommerfeldt, H., Higgins, D., and Argos, P. (1992) OBSTRUCT: a program to obtain largest cliques from a protein sequence set according to structural resolution and sequence similarity. Comput. Appl. Biosci. 8, 599–600.
Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.
Rost, B. and Sander, C. (1993) Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232, 584–599.
Frishman, D. and Argos, P. (1996) Incorporation of long-distance interactions in a secondary structure prediction method. Prot. Eng. 9, 133–142.
Frishman, D. and Argos, P. (1997) Seventy-five percent accuracy in protein secondary structure prediction. Proteins 27, 329–335.
King, R. D. and Sternberg, M. J. E. (1996) Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Prot. Sci. 5, 2298.
Salamov, A. A. and Solovyev, V. V. (1995) Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J. Mol. Biol. 247, 11–15.
Zvelebil, M. J., Barton, G. J., Taylor, W. R., and Sternberg, M. J. E. (1987) Prediction of protein secondary structure and active sites using the alignment of homologous sequences. J. Mol. Biol. 195, 957.
Cuff, J. A. and Barton, G. J. (1999) Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins 34, 508–519.
Mehta, P., Heringa, J., and Argos, P. (1995) A simple and fast approach to prediction of protein secondary structure from multiply aligned sequences with accuracy above 70%. Prot. Sci. 4, 2517–2525.
Sali, A. and Blundell, T. L. (1993) Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815.
Guex, N. and Peitsch, M. C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modelling. Electrophoresis 18, 2714–2723.
Saitou, N. and Nei, M. (1987) The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
Sneath, P. H. and Sokal, R. R. (1973) Numerical Taxonomy. Freeman, San Francisco, California.
Needleman, S. B. and Wunsch, C. D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453.
Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
Baldi, P., Chauvin, Y., Hunkapiller, T., and McClure, M. A. (1994) Hidden Markov models of biological primary sequence information. Proc. Natl. Acad. Sci. USA 91, 1059–1063.
Krogh, A., Mian, I. S., Sjölander, K., and Haussler, D. (1994) Hidden Markov models in computational biology. J. Mol. Biol. 235, 1501–1531.
Notredame, C., Holm, L., and Higgins, D. G. (1998) COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14, 407–422.
Thompson, J. D., Plewniak, F., and Poch, O. (1999) A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27, 2682–2690.
Thompson, J. D., Plewniak, F., and Poch, O. (1999) BAliBASE: a benchmark alignment database for the evaluation of multiple sequence alignment programs. Bioinformatics 15, 87–88.
Gotoh, O. (1996) Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J. Mol. Biol. 264, 823–838.
Morgenstern, B., Dress, A., and Werner, T. (1996) Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc. Natl. Acad. Sci. USA 93, 12,098–12,103.
Eddy, S. R. (1995) Multiple alignment using hidden Markov models, in Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology (Rawlings, C., Clark, D., Altman, R., Hunter, L., Lengauer, T., and Wodak, S., eds.), AAAI Press, pp. 114–120.
Lawrence, C. E., Altschul, S. F., Boguski, M. S., Liu, J. S., Neuwald, A. F., and Wootton, J. C. (1993) Detecting subtle sequence signals: the Gibbs sampling strategy for multiple alignment. Science 262, 208–214.
Dayhoff, M. O., Barker, W. C., and Hunt, L. T. (1983) Establishing homologies in protein sequences. Methods Enzymol. 91, 524–545.
Gonnet, G. H., Cohen, M. A., and Benner, S. A. (1992) Exhaustive matching of the entire protein sequence database. Science 256, 1443–1445.
Taylor, W. R. (1988) A flexible method to align large numbers of biological sequences. J. Mol. Evol. 28, 161–169.
Camin, J. H. and Sokal, R. R. (1965) Computer comparison of new and existing criteria for constructing evolutionary trees from sequence data. J. Mol. Evol. 19, 9–19.
Eck, R. V. and Dayhoff, M. O. (1966) Atlas of Protein Sequence and Structure. Natl. Biomed. Res. Found., Silver Spring, Maryland.
Fitch, W. M. and Margoliash, E. (1967) Construction of phylogenetic trees. Science 155, 279–284.
Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376.
Adachi, J. and Hasegawa, M. (1996) MOLPHY v. 2.3: programs for molecular phylogenetics based on maximum likelihood. Comp. Sci. Monographs, 28, 1–150. Institute of Statistical Mathematics, Tokyo.
Hogeweg, P. and Hesper, B. (1984) The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J. Mol. Evol. 20, 175–186.
Kimura, M. (1983) The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, England.
Felsenstein, J. (1989) PHYLIP-phylogeny inference package (version 3.2). Cladistics 5, 164–166.
Felsenstein, J. (1990) PHYLIP Manual version 3.3. University Herbarium, University of California, Berkeley, California.
Felsenstein, J. (1996) Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. 266, 418–427.
Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Murata, M., Richardson, J. S., and Sussman, J. L. (1985) Simultaneous comparison of three protein sequences. Proc. Natl. Acad. Sci. USA 82, 3073–3077.
Carrillo, H. and Lipman, D. J. (1988) The multiple sequence alignment problem in biology. SIAM J. Appl. Math. 48, 1073–1082.
Lipman, D. J., Altschul, S. F., and Kececioglu, J. D. (1989) A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. USA 86, 4412–4415.
Feng, D. F. and Doolittle, R. F. (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351–360.
Barton, G. J. and Sternberg, M. J. E. (1987) A strategy for the rapid multiple alignment of protein sequences: confidence levels from tertiary structure comparisons. J. Mol. Biol. 198, 327–337.
Genetics Computer Group. Program manual for the GCG Package, Version 8. 575 Science Drive, Madison, Wisconsin.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Notredame, C. and Higgins, D. G. (1996) SAGA: sequence alignment by genetic algorithm. Nucleic Acids Res. 24, 1515–1524.

Author information

Authors and Affiliations

Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
Aalim M. Weljie
Division of Mathematical Biology, MRC National Institute for Medical Research, London, UK
Jaap Heringa

Editor information

Editors and Affiliations

Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
Hans J. Vogel

About this protocol

Cite this protocol

Weljie, A.M., Heringa, J. (2002). Exploring Familial Relationships Using Multiple Sequence Alignment. In: Vogel, H.J. (ed.) Calcium-Binding Protein Protocols: Volume 2: Methods and Techniques. Methods in Molecular Biology™, vol 173. Springer, Totowa, NJ. https://doi.org/10.1385/1-59259-184-1:231

DOI: https://doi.org/10.1385/1-59259-184-1:231
Publisher Name: Springer, Totowa, NJ
Print ISBN: 978-0-89603-689-5
Online ISBN: 978-1-59259-184-8
eBook Packages: Springer Protocols
