
A New Hybrid Approach to Predict Subcellular Localization by Incorporating Protein Evolutionary Conservation Information

  • Conference paper
Life System Modeling and Simulation (LSMS 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4689)


Abstract

The rapidly increasing number of sequences entering genome databanks has created the need for fully automated methods to analyze them. Knowing the cellular location of a protein is a key step towards understanding its function. The development of statistical methods for predicting protein attributes generally involves two core tasks: constructing a training dataset and formulating a predictive algorithm. The latter can be further divided into two sub-tasks: finding a mathematical expression that effectively represents a protein, and finding a powerful algorithm that performs the prediction accurately. Here, an improved evolutionary conservation algorithm is proposed to calculate a per-residue conservation score. Each protein can then be represented as a feature vector created with multi-scale energy (MSE). In addition, the protein can be represented by other feature vectors based on amino acid composition (AAC), the weighted auto-correlation function, and moment descriptors. Finally, a novel hybrid approach is developed by fusing the four kinds of feature classifiers through a product rule to predict 12 subcellular locations. Compared with existing methods, this new approach provides better predictive performance. High success rates were obtained in both the jackknife cross-validation test and the independent dataset test, suggesting that introducing protein evolutionary information and fusing multi-feature classifiers are quite promising and may also hold great potential as a useful vehicle in other areas of molecular biology.
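Two of the building blocks named in the abstract can be illustrated with a minimal sketch: the amino acid composition (AAC) feature vector and a product-rule fusion of per-classifier class probabilities. This is not the authors' implementation. The function names, the handling of non-standard residues, the assumption of equal class priors, and the example probabilities are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import prod

# The 20 standard amino acids, in a fixed (assumed) order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional AAC vector: the fraction of each standard residue.

    Non-standard symbols (e.g. 'X') are ignored; this is an assumption of the sketch.
    """
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def product_rule_fusion(posteriors):
    """Fuse classifier outputs by multiplying class probabilities class-wise.

    posteriors: one row per classifier, one column per class.
    Equal class priors are assumed, so the prior term drops out.
    Returns (index of the winning class, normalized fused scores).
    """
    n_classes = len(posteriors[0])
    if any(len(row) != n_classes for row in posteriors):
        raise ValueError("every classifier must score the same classes")
    fused = [prod(row[c] for row in posteriors) for c in range(n_classes)]
    total = sum(fused)
    if total == 0:
        raise ValueError("every class received a zero product")
    scores = [value / total for value in fused]
    winner = max(range(n_classes), key=scores.__getitem__)
    return winner, scores


if __name__ == "__main__":
    # Hypothetical sequence fragment, for illustration only.
    print(amino_acid_composition("MKTAYIAKQR")[:5])

    # Four hypothetical classifiers (one per feature type) scoring three classes.
    example = [
        [0.6, 0.3, 0.1],
        [0.5, 0.4, 0.1],
        [0.2, 0.7, 0.1],
        [0.4, 0.5, 0.1],
    ]
    print(product_rule_fusion(example))
```

Under the product rule, a single classifier that assigns a near-zero probability to a class effectively vetoes it, so implementations often smooth or floor the probabilities before multiplying. Whether the paper does so is not stated in the abstract.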




Editor information

Kang Li, Xin Li, George William Irwin, Gusen He


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, S., Zhang, Y., Li, J., Yang, H., Cheng, Y., Zhou, G. (2007). A New Hybrid Approach to Predict Subcellular Localization by Incorporating Protein Evolutionary Conservation Information. In: Li, K., Li, X., Irwin, G.W., He, G. (eds) Life System Modeling and Simulation. LSMS 2007. Lecture Notes in Computer Science, vol 4689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74771-0_20


  • DOI: https://doi.org/10.1007/978-3-540-74771-0_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74770-3

  • Online ISBN: 978-3-540-74771-0

  • eBook Packages: Computer Science, Computer Science (R0)
