Abstract
This Handbook of Proteomic Methods largely comprises current experimental technologies to identify, quantify, and characterize expressed proteins and their interactions within cells, tissues, and body fluids. These techniques have evolved rapidly, with impetus from the industrial biotechnology sector. Nevertheless, experimentally elucidating all proteomic constituents of an organism and documenting their interactions remain formidable tasks. The problem is compounded by the broad diversity in protein expression generated by alternative splicing of pre-mRNA and by post-translational modifications. In one dramatic example, more than 38,000 different isoforms of Down syndrome cell adhesion molecule (DSCAM) were observed in Drosophila melanogaster (1). The combinatorics of comprehensively characterizing all protein-protein interactions, especially in higher eukaryotes, are therefore prohibitive, even with advanced high-throughput approaches.
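The scale of that combinatorial problem can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from the chapter: it counts the unordered pairs that an exhaustive pairwise screen would need to test. The function name `candidate_pairs` is hypothetical, the figure of 6,000 proteins is an assumption borrowed from the yeast gene count in the reference list (one protein per gene), and treating each of 38,000 DSCAM isoforms as a separate interaction partner is likewise an assumption made for the sake of the arithmetic.

```python
from math import comb


def candidate_pairs(n_proteins: int, include_self: bool = False) -> int:
    """Count unordered protein pairs among n_proteins distinct proteins.

    With include_self=True, self-pairs (homodimers) are counted as well.
    """
    pairs = comb(n_proteins, 2)  # n * (n - 1) / 2
    if include_self:
        pairs += n_proteins
    return pairs


# Assumed, illustrative proteome sizes (not measurements from the chapter).
examples = [
    ("yeast-scale proteome, about 6,000 proteins", 6_000),
    ("38,000 isoforms of a single gene", 38_000),
]

for label, n in examples:
    print(f"{label}: {candidate_pairs(n):,} candidate pairs")
# yeast-scale proteome, about 6,000 proteins: 17,997,000 candidate pairs
# 38,000 isoforms of a single gene: 721,981,000 candidate pairs
```

Because the count grows quadratically with the number of distinct proteins, isoform diversity enlarges the search space far faster than it enlarges the proteome itself.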
References
Schmucker, D., Clemens, J. C., Shu, H., et al. (2000) Drosophila DSCAM is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell 101, 671–684.
Fung, Y. C. (1993) Biomechanics: Mechanical Properties of Living Tissues, 2nd ed. Springer-Verlag, New York.
Spellman, P. T. and Rubin, G. M. (2002) Evidence for large domains of similarly expressed genes in the Drosophila genome. J. Biol. 1, 5.1–5.8.
Boser, B. E., Guyon, I. M., and Vapnik, V. N. (1992) A training algorithm for optimal margin classifiers, in Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory (Haussler, D., ed.), ACM Press, Pittsburgh, PA, pp. 144–152.
Vapnik, V. N. (1995) The Nature of Statistical Learning Theory. Springer-Verlag, Heidelberg, Germany.
Bock, J. R. and Gough, D. A. (2001) Predicting protein-protein interactions from primary structure. Bioinformatics 17, 455–460.
Xenarios, I., Rice, D. W., Salwinski, L., Baron, M. K., Marcotte, E. M., and Eisenberg, D. (2000) DIP: The database of interacting proteins. Nucleic Acids Res. 28, 289–291.
Kandel, D., Mathias, Y., Unger, R., and Winkler, P. (1996) Shuffling biological sequences. Discrete Appl. Math. 71, 171–185.
Eisenberg, D. (1984) Three-dimensional structure of membrane and surface proteins. Ann. Rev. Biochem. 53, 595–623.
Bull, H. B. and Breese, K. (1974) Surface tension of amino acid solutions: a hydrophobicity scale of the amino acid residues. Arch. Biochem. Biophys. 161, 665–670.
Provost, F., Fawcett, T., and Kohavi, R. (1998) The case against accuracy estimation for comparing induction algorithms, in Proceedings of the Fifteenth International Conference on Machine Learning (ICML-98), Morgan Kaufmann, San Francisco, CA, pp. 445–453.
Weiss, G. M. and Provost, F. (2001) The effect of class distribution on classifier learning: an empirical study. Technical Report ML-TR-44, Department of Computer Science, Rutgers University.
Swingler, K. (1996) Applying Neural Networks: A Practical Guide. Academic, London, UK.
Kwok, J. T. (1999) Moderating the outputs of support vector machine classifiers. IEEE Trans. Neural Net. 10, 1018–1031.
Platt, J. C. (1999) Fast training of support vector machines using sequential minimal optimization, in Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, pp. 185–208.
Witten, I. H. and Frank, E. (1999) Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA.
Elkan, C. (2001) The foundations of cost-sensitive learning, in Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI), Seattle, WA, pp. 973–978.
Bock, J. R. and Gough, D. A. (2003) Machine learning inference of protein-protein binding in Saccharomyces cerevisiae, in review.
Goffeau, A., Barrell, B. G., Bussey, H., et al. (1996) Life with 6000 genes. Science 274, 563–567.
Chervitz, S. A., Aravind, L., Sherlock, G., Ball, C. A., Koonin, E. V., and Dwight, S. S. (1998) Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 282, 2022–2028.
Mumberg, D., Muller, R., and Funk, M. (1995) Yeast vectors for the controlled expression of heterologous proteins in different genetic backgrounds. Gene 156, 119–122.
Munder, T. and Hinnen, A. (1999) Yeast cells as tools for target-oriented screening. Appl. Microbiol. Biotechnol. 52, 311–320.
Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA 98, 4569–4574.
Bartel, P., Chien, C. T., Sternglanz, R., and Fields, S. (1993) Elimination of false positives that arise in using the two-hybrid system. Biotechniques 14, 920–924.
Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
Altschul, S. F. and Gish, W. (1996) Local alignment statistics. Methods Enzymol. 266, 460–480.
Henikoff, S. and Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.
Kohavi, R. and Provost, F. (1998) Glossary of terms. Machine Learning 30, 271–274.
Peterson, W. W. and Birdsall, T. G. (1953) The theory of signal detectability. Technical Report TR-13, Communications and Signal Processing Laboratory, University of Michigan, Ann Arbor, MI.
Stone, M. (1974) Cross-validatory choices and assessment of statistical predictions. J. Roy. Stat. Soc. 36, 111–147.
Skolnik, M. I. (1980) Introduction to Radar Systems, 2nd ed. McGraw-Hill, New York.
Urick, R. J. (1983) Principles of Underwater Sound, 3rd ed. McGraw-Hill, New York.
Druker, B. J., Talpaz, M. T., Resta, D. J., et al. (2001) Efficacy and safety of a specific inhibitor of the BCR-ABL tyrosine kinase in chronic myeloid leukemia and acute lymphoblastic leukemia. N. Engl. J. Med. 344, 1031–1037.
Black, D. L. (2000) Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell 103, 367–370.
Bock, J. R. and Gough, D. A. (2003) Whole-proteome interaction mining. Bioinformatics 19, 125–135.
Bradley, P. S., Fayyad, U. M., and Mangasarian, O. L. (1998) Mathematical programming for data mining: formulations and challenges. Technical Report MSR-98–01, University of Wisconsin Data Mining Institute, Madison, WI.
Rain, J. C., Selig, L., De Reuse, H., et al. (2001) The protein-protein interaction map of Helicobacter pylori. Nature 409, 211–215.
Burges, C. (1998) A tutorial on support vector machines for pattern recognition. Data Mining Knowledge Discovery 2, 121–167.
Sankoff, D., Leduc, G., Paquin, B., Lang, B. F., and Cedergren, R. (1992) Gene order comparisons for phylogenetic inference: evolution of the mitochondrial genome. Proc. Natl. Acad. Sci. USA 89, 6575–6579.
Tekaia, F., Lazcano, A., and Dujon, B. (1999) The genomic tree as revealed from whole proteome comparisons. Genome Res. 9, 550–557.
Brown, J. R., Douady, C. J., Italia, M. J., Marshall, W. E., and Stanhope, M. H. (2001) Universal trees based on large combined protein sequence data sets. Nat. Genet. 28, 281–285.
Efron, B. and Gong, G. (1983) A leisurely look at the bootstrap, the jackknife, and cross-validation. Am. Stat. 37, 36–48.
Eisen, J. A. (2000) Assessing evolutionary relationships among microbes from whole-genome analysis. Curr. Opin. Microbiol. 3, 475–480.
Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., and Alon, U. (2002) Network motifs: simple building blocks of complex networks. Science 298, 824–827.
Klumpp, S. and Krieglstein, J. (2002) Phosphorylation and dephosphorylation of histidine residues in proteins. Eur. J. Biochem. 269, 1067–1071.
Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and Watson, J. D. (1989) Molecular Biology of the Cell, 2nd ed. New York.
Bairoch, A., Bucher, P., and Hofmann, K. (1997) The PROSITE database, its status in 1997. Nucleic Acids Res. 25, 217–221.
Matsushita, M. and Janda, K. D. (2002) Histidine kinases as targets for new antimicrobial agents. Bioorg. Med. Chem. 10, 855–867.
Andrews, S. C. (1998) Iron storage in bacteria. Adv. Microb. Physiol. 40, 281–351.
Jeong, H., Mason, S. P., Barabási, A.-L., and Oltvai, Z. N. (2001) Lethality and centrality in protein networks. Nature 411, 41–42.
Cunningham, M. J. (2000) Genomics and proteomics: the new millennium of drug discovery and development. J. Pharmacol. Toxicol. Methods 44, 291–300.
Bissantz, C., Folkers, G., and Rognan, D. (2000) Protein-based virtual screening of chemical databases. 1. Evaluation of different docking/scoring combinations. J. Med. Chem. 43, 4759–4767.
Waszkowycz, B. (2002) Structure-based approaches to drug design and virtual screening. Curr. Opin. Drug Discovery Dev. 5, 407–413.
Langer, T. and Hoffmann, R. D. (2001) Virtual screening: an effective tool for lead structure discovery? Curr. Pharma. Design 7, 509–527.
Gohlke, H. and Klebe, G. (2001) Statistical potentials and scoring functions applied to protein-ligand binding. Curr. Opin. Struct. Biol. 11, 231–235.
Böhm, H. J. (1998) Prediction of binding constants of protein ligands: a fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J. Comput. Aided Mol. Design 12, 309–323.
Moret, E. E., van Wijk, M. C., Kostense, A. S., and Gillies, M. B. (1999) Scoring peptide(mimetic)-protein interactions. Med. Chem. Res. 9, 604–620.
Bock, J. R. and Gough, D. A. (2002) A new method to estimate ligand-receptor energetics. Mol. Cell. Proteomics 1, 904–910.
Smola, A. J. and Schölkopf, B. (1998) A tutorial on support vector regression. Technical Report NC-TR-98–030, Royal Holloway College, University of London, London.
Ortiz, A. R., Pisabarro, M. T., Gago, F., and Wade, R. C. (1995) Prediction of drug binding affinities by comparative binding energy analysis. J. Med. Chem. 38, 2681–2691.
Chen, Y. Z. and Zhi, D. G. (2001) Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins 43, 217–226.
Berman, H. M., Westbrook, J., Feng, Z., et al. (2000) The Protein Data Bank. Nucleic Acids Res. 28, 235–242.
Weininger, D. (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inform. Comput. Sci. 28, 31–36.
Wegner, J. and Zell, A. (2002) JOELib: a Java based computational chemistry package, in 6th Darmstädter Molecular-Modelling Workshop, Technische Universität, Darmstadt, Germany.
Burden, F. R. (1989) Molecular identification number for substructure searches. J. Chem. Inform. Comput. Sci. 29, 225–227.
Boikess, R. S. and Edelson, E. (1981) Chemical Principles, 2nd ed. Harper & Row, New York.
Golub, G. H. and van Loan, C. F. (1989) Matrix Computations, 2nd ed. Johns Hopkins University Press, Baltimore, MD.
Gershenfeld, N. A. and Weigend, A. S. (1993) The Future of Time Series: Learning and Understanding, vol. XV of Santa Fe Institute Studies in the Sciences of Complexity. Addison-Wesley, Reading, MA, pp. 1–70.
Kendall, M. G. (1938) A new measure of rank correlation. Biometrika 30, 81–93.
Head, R. D., Smythe, M. L., Oprea, T. I., Waller, C. L., Green, S. M., and Marshall, G. R. (1996) VALIDATE: a new method for the receptor-based prediction of binding affinities of novel ligands. J. Amer. Chem. Soc. 118, 3959–3969.
Wang, R., Liu, L., Lai, L., and Tang, Y. (1998) SCORE: a new empirical method for estimating the binding affinity of a protein-ligand complex. J. Mol. Modeling 4, 379–394.
Schwikowski, B., Uetz, P., and Fields, S. (2000) A network of protein-protein interactions in yeast. Nat. Biotechnol. 18, 1257–1261.
Wojcik, J. and Schächter, V. (2001) Protein-protein interaction map inference using interacting domain profile pairs. Bioinformatics 17 (suppl. 1), S296–S305.
Uetz, P., Giot, L., Cagney, G., et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623–627.
Tucker, C. L., Gera, J. F., and Uetz, P. (2001) Towards an understanding of complex protein networks. Trends Cell Biol. 11, 102–106.
Walhout, A., Boulton, S., and Vidal, M. (2000) Yeast two-hybrid systems and protein interaction mapping projects for yeast and worm. Yeast 17, 88–94.
Wang, R., Lai, L., and Wang, S. (2002) Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J. Comput. Aided Mol. Design 16, 11–26.
Rarey, M., Kramer, B., Bernd, C., and Lengauer, T. (1996) Time-efficient docking of similar flexible ligands, in Biocomputing: Proceedings of the 1996 Pacific Symposium, Hunter, L. and Klein, T., eds., January 3–6, World Scientific Publishing, Singapore.
Zhang, T. and Koshland, D. E. (1996) Computational method for relative binding energies of enzyme-substrate complexes. Protein Sci. 5, 348–356.
Schapira, M., Totrov, M., and Abagyan, R. (1999) Prediction of the binding energy for small molecules, peptides and proteins. J. Mol. Recog. 12, 177–190.
© 2003 Springer Science+Business Media New York
About this chapter
Cite this chapter
Bock, J.R., Gough, D.A. (2003). In Silico Proteomics. In: Conn, P.M. (eds) Handbook of Proteomic Methods. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-59259-414-6_13
DOI: https://doi.org/10.1007/978-1-59259-414-6_13
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-61737-504-0
Online ISBN: 978-1-59259-414-6
eBook Packages: Springer Book Archive