A New Hybrid Approach to Predict Subcellular Localization by Incorporating Protein Evolutionary Conservation Information

Zhang, ShaoWu; Zhang, YunLong; Li, JunHui; Yang, HuiFeng; Cheng, YongMei; Zhou, GuoPing

doi:10.1007/978-3-540-74771-0_20


Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4689)

Included in the following conference series:

International Conference on Life System Modeling and Simulation

Abstract

The rapidly increasing number of sequences entering genome databanks has created a need for fully automated methods to analyze them. Knowing the cellular location of a protein is a key step towards understanding its function. Statistical prediction of protein attributes generally involves two core tasks: constructing a training dataset and formulating a predictive algorithm. The latter can be further divided into two subtasks: finding a mathematical expression that effectively represents a protein, and finding a powerful algorithm that performs the prediction accurately. Here, an improved evolutionary conservation algorithm is proposed to calculate a per-residue conservation score. Each protein is then represented as a feature vector built from the multi-scale energy (MSE) of these scores. In addition, each protein is represented by three further feature vectors based on amino acid composition (AAC), the weighted auto-correlation function, and moment descriptors. Finally, a novel hybrid approach is developed that fuses the classifiers trained on these four kinds of features through the product rule to predict 12 subcellular locations. Compared with existing methods, the new approach provides better predictive performance. High success rates were obtained in both the jackknife cross-validation test and the independent dataset test, suggesting that introducing protein evolutionary information and fusing classifiers built on multiple feature types are both quite promising, and that the approach might also hold great potential as a useful tool in other areas of molecular biology.
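As a rough illustration of two ingredients named in the abstract, the sketch below computes an amino acid composition (AAC) vector and fuses class-probability outputs from several classifiers with the product rule. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, equal class priors are assumed so that the product rule reduces to multiplying the per-classifier posteriors, and the MSE, weighted auto-correlation and moment descriptor features are not reproduced here.

```python
from math import prod

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional AAC vector (fraction of each standard residue).

    Hypothetical helper: symbols outside the 20 standard residues are ignored.
    """
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def product_rule_fusion(posteriors):
    """Fuse per-classifier class-probability vectors by the product rule.

    posteriors: one probability vector per classifier, all over the same
    ordered list of classes. Assumes equal class priors (an assumption of
    this sketch), so the fused score is the plain product of posteriors.
    Returns (index of the winning class, normalised fused scores).
    """
    n_classes = len(posteriors[0])
    if any(len(p) != n_classes for p in posteriors):
        raise ValueError("all classifiers must score the same classes")
    fused = [prod(p[k] for p in posteriors) for k in range(n_classes)]
    total = sum(fused)
    if total == 0:
        raise ValueError("every class was given zero probability by some classifier")
    scores = [f / total for f in fused]
    best = max(range(n_classes), key=scores.__getitem__)
    return best, scores
```

In use, each of the four feature-specific classifiers would supply one probability vector over the 12 locations, and `product_rule_fusion` would return the index of the predicted location. One design consequence worth noting: a single classifier that assigns zero probability to a class vetoes that class outright, which is why product-rule fusion is usually paired with well-calibrated or smoothed classifier outputs.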

Author information

Authors and Affiliations

College of Automation, Northwestern Polytechnical University, Xi’an, 710072, China
ShaoWu Zhang, JunHui Li, HuiFeng Yang & YongMei Cheng
Department of Computer, First Aeronautical Institute of Air Force, Henan, 464000, China
YunLong Zhang
Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA
GuoPing Zhou

Editor information

Kang Li, Xin Li, George William Irwin, Gusen He

About this paper

Cite this paper

Zhang, S., Zhang, Y., Li, J., Yang, H., Cheng, Y., Zhou, G. (2007). A New Hybrid Approach to Predict Subcellular Localization by Incorporating Protein Evolutionary Conservation Information. In: Li, K., Li, X., Irwin, G.W., He, G. (eds) Life System Modeling and Simulation. LSMS 2007. Lecture Notes in Computer Science(), vol 4689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74771-0_20

DOI: https://doi.org/10.1007/978-3-540-74771-0_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74770-3
Online ISBN: 978-3-540-74771-0
eBook Packages: Computer Science, Computer Science (R0)
