Abstract
The fundamental unit of protein structure is the domain, defined as a region or regions of a polypeptide that fold independently and possess a hydrophobic core with a hydrophilic exterior (see Note 1). Domains, particularly those with enzymatic activities, may possess functions regardless of whether they occur in isolation or as part of a larger multidomain protein. Other domains confer regulatory and specificity properties on multidomain proteins, usually by providing binding sites. Because the majority of eukaryotic proteins, and a large number of eubacterial and archaeal proteins, are multidomain in character, determining the structures and functions of these proteins requires detailed consideration of their domain architectures.
Copyright information
© 2005 Humana Press Inc., Totowa, NJ
About this protocol
Cite this protocol
Ponting, C.P., Birney, E. (2005). Protein Sequence Analysis and Domain Identification. In: Walker, J.M. (eds) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. https://doi.org/10.1385/1-59259-890-0:527
DOI: https://doi.org/10.1385/1-59259-890-0:527
Publisher Name: Humana Press
Print ISBN: 978-1-58829-343-5
Online ISBN: 978-1-59259-890-8
eBook Packages: Springer Protocols