In Silico Proteomics

Bock, Joel R.; Gough, David A.

doi:10.1007/978-1-59259-414-6_13

Abstract

This Handbook of Proteomic Methods largely comprises current experimental technologies to identify, quantify, and characterize expressed proteins and their interactions within cells, tissues, and body fluids. These techniques have evolved rapidly with an impetus from the industrial biotechnology sector. Nevertheless, experimental elucidation of all proteomic constituents within an organism, and the documentation of their interactions, remain formidable tasks. The problem is further complicated by the broad diversity in protein expression that arises from alternative splicing of pre-mRNA and from post-translational modifications. In one dramatic example, more than 38,000 different isoforms of Down syndrome cell adhesion molecule (DSCAM) were observed in Drosophila melanogaster (1). The combinatorics required to comprehensively explicate all protein-protein interactions, especially for higher eukaryotes, are therefore prohibitive, even with the use of advanced high-throughput approaches.
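To give a rough sense of the scale behind the combinatorial argument, the sketch below counts unordered pairs among n proteins using n(n-1)/2. It is only an illustration: the function name `pair_count` is hypothetical, the 6,000-protein figure is an assumed round number for a small eukaryotic proteome, and the 38,000 figure is the DSCAM isoform count quoted in the abstract. It counts candidate pairs and says nothing about how many of them actually interact.

```python
from math import comb


def pair_count(n_proteins: int) -> int:
    """Number of unordered pairs of distinct proteins, n(n-1)/2."""
    return comb(n_proteins, 2)


# Illustrative sizes only: 6,000 is an assumed round figure for a small
# eukaryotic proteome; 38,000 is the DSCAM isoform count from the abstract.
for label, n in [("assumed 6,000-protein proteome", 6_000),
                 ("38,000 DSCAM isoforms", 38_000)]:
    print(f"{label}: {pair_count(n):,} candidate pairs")
```

Under these assumptions the counts come to roughly 18 million and 722 million candidate pairs, respectively, before splice variants or modification states of the remaining proteins are considered.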

References

Schmucker, D., Clemens, J. C., Shu, H., et al. (2000) Drosophila DSCAM is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell 101, 671–684.
Fung, Y. C. (1993) Biomechanics: Mechanical Properties of Living Tissues, 2nd ed. Springer-Verlag, New York.
Spellman, P. T. and Rubin, G. M. (2002) Evidence for large domains of similarly expressed genes in the Drosophila genome. J. Biol. 1, 5.1–5.8.
Boser, B. E., Guyon, I. M., and Vapnik, V. N. (1992) A training algorithm for optimal margin classifiers, in Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory (Haussler, D., ed.), ACM Press, Pittsburgh, PA, pp. 144–152.
Vapnik, V. N. (1995) The Nature of Statistical Learning Theory. Springer-Verlag, Heidelberg, Germany.
Bock, J. R. and Gough, D. A. (2001) Predicting protein-protein interactions from primary structure. Bioinformatics 17, 455–460.
Xenarios, I., Rice, D. W., Salwinski, L., Baron, M. K., Marcotte, E. M., and Eisenberg, D. (2000) DIP: The database of interacting proteins. Nucleic Acids Res. 28, 289–291.
Kandel, D., Mathias, Y., Unger, R., and Winkler, P. (1996) Shuffling biological sequences. Discrete Appl. Math. 71, 171–185.
Eisenberg, D. (1984) Three-dimensional structure of membrane and surface proteins. Ann. Rev. Biochem. 53, 595–623.
Bull, H. B. and Breese, K. (1974) Surface tension of amino acid solutions: a hydrophobicity scale of the amino acid residues. Arch. Biochem. Biophys. 161, 665–670.
Provost, F., Fawcett, T., and Kohavi, R. (1998) The case against accuracy estimation for comparing induction algorithms, in Proceedings of the Fifteenth International Conference on Machine Learning (ICML-98), Morgan Kaufmann, San Francisco, CA, pp. 445–453.
Weiss, G. M. and Provost, F. (2001) The effect of class distribution on classifier learning: an empirical study. Technical Report ML-TR-44, Department of Computer Science, Rutgers University.
Swingler, K. (1996) Applying Neural Networks: A Practical Guide. Academic, London, UK.
Kwok, J. T. (1999) Moderating the outputs of support vector machine classifiers. IEEE Trans. Neural Net. 10, 1018–1031.
Platt, J. C. (1999) Fast training of support vector machines using sequential minimal optimization, in Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, pp. 185–208.
Witten, I. H. and Frank, E. (1999) Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA.
Elkan, C. (2001) The foundations of cost-sensitive learning, in Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI), Seattle, WA, pp. 973–978.
Bock, J. R. and Gough, D. A. (2003) Machine learning inference of protein-protein binding in Saccharomyces cerevisiae, in review.
Goffeau, A., Barrell, B. G., Bussey, H., et al. (1996) Life with 6000 genes. Science 274, 563–567.
Chervitz, S. A., Aravind, L., Sherlock, G., Ball, C. A., Koonin, E. V., and Dwight, S. S. (1998) Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 282, 2022–2028.
Mumberg, D., Muller, R., and Funk, M. (1995) Yeast vectors for the controlled expression of heterologous proteins in different genetic backgrounds. Gene 156, 119–122.
Munder, T. and Hinnen, A. (1999) Yeast cells as tools for target-oriented screening. Appl. Microbiol. Biotechnol. 52, 311–320.
Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA 98, 4569–4574.
Bartel, P., Chien, C. T., Sternglanz, R., and Fields, S. (1993) Elimination of false positives that arise in using the two-hybrid system. Biotechniques 14, 920–924.
Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
Altschul, S. F. and Gish, W. (1996) Local alignment statistics. Methods Enzymol. 266, 460–480.
Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.
Kohavi, R. and Provost, F. (1998) Glossary of terms. Machine Learning 30, 271–274.
Peterson, W. W. and Birdsall, T. G. (1953) The theory of signal detectability. Technical Report TR-13, Communications and Signal Processing Laboratory, University of Michigan, Ann Arbor, MI.
Stone, M. (1974) Cross-validatory choices and assessment of statistical predictions. J. Roy. Stat. Soc. 36, 111–147.
Skolnik, M. I. (1980) Introduction to Radar Systems, 2nd ed. McGraw-Hill, New York.
Urick, R. J. (1983) Principles of Underwater Sound, 3rd ed. McGraw-Hill, New York.
Druker, B. J., Talpaz, M. T., Resta, D. J., et al. (2001) Efficacy and safety of a specific inhibitor of the BCR-ABL tyrosine kinase in chronic myeloid leukemia and acute lymphoblastic leukemia. N. Engl. J. Med. 344, 1031–1037.
Black, D. L. (2000) Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell 103, 367–370.
Bock, J. R. and Gough, D. A. (2003) Whole-proteome interaction mining. Bioinformatics 19, 125–135.
Bradley, P. S., Fayyad, U. M., and Mangasarian, O. L. (1998) Mathematical programming for data mining: formulations and challenges. Technical Report MSR-98-01, University of Wisconsin Data Mining Institute, Madison, WI.
Rain, J. C., Selig, L., De Reuse, H., et al. (2001) The protein-protein interaction map of Helicobacter pylori. Nature 409, 211–215.
Burges, C. (1998) A tutorial on support vector machines for pattern recognition. Data Mining Knowledge Discovery 2, 121–167.
Sankoff, D., Leduc, G., Paquin, B., Lang, B. F., and Cedergren, R. (1992) Gene order comparisons for phylogenetic inference: evolution of the mitochondrial genome. Proc. Natl. Acad. Sci. USA 89, 6575–6579.
Tekaia, F., Lazcano, A., and Dujon, B. (1999) The genomic tree as revealed from whole proteome comparisons. Genome Res. 9, 550–557.
Brown, J. R., Douady, C. J., Italia, M. J., Marshall, W. E., and Stanhope, M. H. (2001) Universal trees based on large combined protein sequence data sets. Nat. Genet. 28, 281–285.
Efron, B. and Gong, G. (1983) A leisurely look at the bootstrap, the jackknife, and cross-validation. Am. Stat. 37, 36–48.
Eisen, J. A. (2000) Assessing evolutionary relationships among microbes from whole-genome analysis. Curr. Opin. Microbiol. 3, 475–480.
Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., and Alon, U. (2002) Network motifs: simple building blocks of complex networks. Science 298, 824–827.
Klumpp, S. and Krieglstein, J. (2002) Phosphorylation and dephosphorylation of histidine residues in proteins. Eur. J. Biochem. 269, 1067–1071.
Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and Watson, J. D. (1989) Molecular Biology of the Cell, 2nd ed. New York.
Bairoch, A., Bucher, P., and Hofmann, K. (1997) The PROSITE database, its status in 1997. Nucleic Acids Res. 25, 217–221.
Matsushita, M. and Janda, K. D. (2002) Histidine kinases as targets for new antimicrobial agents. Bioorg. Med. Chem. 10, 855–867.
Andrews, S. C. (1998) Iron storage in bacteria. Adv. Microb. Physiol. 40, 281–351.
Jeong, H., Mason, S. P., Barabási, A.-L., and Oltvai, Z. N. (2001) Lethality and centrality in protein networks. Nature 411, 41–42.
Cunningham, M. J. (2000) Genomics and proteomics: the new millennium of drug discovery and development. J. Pharmacol. Toxicol. Methods 44, 291–300.
Bissantz, C., Folkers, G., and Rognan, D. (2000) Protein-based virtual screening of chemical databases. 1. Evaluation of different docking/scoring combinations. J. Med. Chem. 43, 4759–4767.
Waszkowycz, B. (2002) Structure-based approaches to drug design and virtual screening. Curr. Opin. Drug Discovery Dev. 5, 407–413.
Langer, T. and Hoffmann, R. D. (2001) Virtual screening: an effective tool for lead structure discovery? Curr. Pharma. Design 7, 509–527.
Gohlke, H. and Klebe, G. (2001) Statistical potentials and scoring functions applied to protein-ligand binding. Curr. Opin. Struct. Biol. 11, 231–235.
Böhm, H. J. (1998) Prediction of binding constants of protein ligands: a fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J. Comput. Aided Mol. Design 12, 309–323.
Moret, E. E., van Wijk, M. C., Kostense, A. S., and Gillies, M. B. (1999) Scoring peptide(mimetic)-protein interactions. Med. Chem. Res. 9, 604–620.
Bock, J. R. and Gough, D. A. (2002) A new method to estimate ligand-receptor energetics. Mol. Cell. Proteomics 1, 904–910.
Smola, A. J. and Schölkopf, B. (1998) A tutorial on support vector regression. Technical Report NC-TR-98-030, Royal Holloway College, University of London, London.
Ortiz, A. R., Pisabarro, M. T., Gago, F., and Wade, R. C. (1995) Prediction of drug binding affinities by comparative binding energy analysis. J. Med. Chem. 38, 2681–2691.
Chen, Y. Z. and Zhi, D. G. (2001) Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins 43, 217–226.
Berman, H. M., Westbrook, J., Feng, Z., et al. (2000) The Protein Data Bank. Nucleic Acids Res. 28, 235–242.
Weininger, D. (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inform. Comput. Sci. 28, 31–36.
Wegner, J. and Zell, A. (2002) JOELib: a Java based computational chemistry package, in 6th Darmstädter Molecular-Modelling Workshop, Technische Universität, Darmstadt, Germany.
Burden, F. R. (1989) Molecular identification number for substructure searches. J. Chem. Inform. Comput. Sci. 29, 225–227.
Boikess, R. S. and Edelson, E. (1981) Chemical Principles, 2nd ed. Harper & Row, New York.
Golub, G. H. and van Loan, C. F. (1989) Matrix Computations, 2nd ed. Johns Hopkins University Press, Baltimore, MD.
Gershenfeld, N. A. and Weigend, A. S. (1993) The Future of Time Series: Learning and Understanding, vol. XV of Santa Fe Institute Studies in the Sciences of Complexity. Addison-Wesley, Reading, MA, pp. 1–70.
Kendall, M. G. (1938) A new measure of rank correlation. Biometrika 30, 81–93.
Head, R. D., Smythe, M. L., Oprea, T. I., Waller, C. L., Green, S. M., and Marshall, G. R. (1996) VALIDATE: a new method for the receptor-based prediction of binding affinities of novel ligands. J. Amer. Chem. Soc. 118, 3959–3969.
Wang, R., Liu, L., Lai, L., and Tang, Y. (1998) SCORE: a new empirical method for estimating the binding affinity of a protein-ligand complex. J. Mol. Modeling 4, 379–394.
Schwikowski, B., Uetz, P., and Fields, S. (2000) A network of protein-protein interactions in yeast. Nat. Biotechnol. 18, 1257–1261.
Wojcik, J. and Schächter, V. (2001) Protein-protein interaction map inference using interacting domain profile pairs. Bioinformatics 17 (suppl. 1), S296–S305.
Uetz, P., Giot, L., Cagney, G., et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623–627.
Tucker, C. L., Gera, J. F., and Uetz, P. (2001) Towards an understanding of complex protein networks. Trends Cell Biol. 11, 102–106.
Walhout, A., Boulton, S., and Vidal, M. (2000) Yeast two-hybrid systems and protein interaction mapping projects for yeast and worm. Yeast 17, 88–94.
Wang, R., Lai, L., and Wang, S. (2002) Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J. Comput. Aided Mol. Design 16, 11–26.
Rarey, M., Kramer, B., Bernd, C., and Lengauer, T. (1996) Time-efficient docking of similar flexible ligands, in Biocomputing: Proceedings of the 1996 Pacific Symposium, Hunter, L. and Klein, T., eds., January 3–6, World Scientific Publishing, Singapore.
Zhang, T. and Koshland, D. E. (1996) Computational method for relative binding energies of enzyme-substrate complexes. Protein Sci. 5, 348–356.
Schapira, M., Totrov, M., and Abagyan, R. (1999) Prediction of the binding energy for small molecules, peptides and proteins. J. Mol. Recog. 12, 177–190.

Editor information

Editors and Affiliations

Oregon National Primate Research Center, Oregon Health and Science University, Beaverton, OR, USA
P. Michael Conn

About this chapter

Cite this chapter

Bock, J.R., Gough, D.A. (2003). In Silico Proteomics. In: Conn, P.M. (ed.) Handbook of Proteomic Methods. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-59259-414-6_13

DOI: https://doi.org/10.1007/978-1-59259-414-6_13
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-61737-504-0
Online ISBN: 978-1-59259-414-6
eBook Packages: Springer Book Archive
