Identification of Domains from Protein Sequences

  • Protocol
Protein Structure Prediction

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 143)

Abstract

The fundamental unit of protein structure is the domain, defined as a region or regions of a polypeptide that fold independently and possess a hydrophobic core (see Note 1). Domains, particularly those with enzymatic activities, may function independently of whether they occur in isolation or as part of a larger multidomain protein. Other domains confer regulatory and specificity properties on multidomain proteins, usually by providing binding sites. Because the majority of eukaryotic proteins, and a large number of eubacterial and archaeal proteins, are multidomain in character, determining the structures and functions of these proteins requires detailed consideration of their domain architectures.


Copyright information

© 2000 Humana Press Inc.

About this protocol

Cite this protocol

Ponting, C.P., Birney, E. (2000). Identification of Domains from Protein Sequences. In: Webster, D.M. (ed.) Protein Structure Prediction. Methods in Molecular Biology™, vol 143. Humana Press. https://doi.org/10.1385/1-59259-368-2:53

  • DOI: https://doi.org/10.1385/1-59259-368-2:53

  • Publisher Name: Humana Press

  • Print ISBN: 978-0-89603-637-6

  • Online ISBN: 978-1-59259-368-2

  • eBook Packages: Springer Protocols
