Protein Sequence Analysis and Domain Identification

Ponting, Chris P.; Birney, Ewan

doi:10.1385/1-59259-890-0:527

Part of the book series: Springer Protocols Handbooks (SPH)

Abstract

The fundamental unit of protein structure is the domain, defined as a region (or regions) of a polypeptide that folds independently and possesses a hydrophobic core with a hydrophilic exterior (see Note 1). Domains, particularly those with enzymatic activities, may carry out their functions regardless of whether they occur in isolation or as part of a larger multidomain protein. Other domains confer regulatory and specificity properties on multidomain proteins, usually by providing binding sites. Because the majority of eukaryotic proteins, and a large number of eubacterial and archaeal proteins, are multidomain in character, determining the structures and functions of these proteins requires detailed consideration of their domain architectures.

Author information

Authors and Affiliations

MRC Functional Genetics Unit University of Oxford, Department of Human Anatomy and Genetics, Oxford, UK
Chris P. Ponting
European Bioinformatics Institute, Cambridge, UK
Ewan Birney

Editor information

Editors and Affiliations

University of Hertfordshire, Hatfield, UK
John M. Walker

About this protocol

Cite this protocol

Ponting, C.P., Birney, E. (2005). Protein Sequence Analysis and Domain Identification. In: Walker, J.M. (eds) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. https://doi.org/10.1385/1-59259-890-0:527

DOI: https://doi.org/10.1385/1-59259-890-0:527
Publisher Name: Humana Press
Print ISBN: 978-1-58829-343-5
Online ISBN: 978-1-59259-890-8
eBook Packages: Springer Protocols
