Abstract
This paper describes a system called prediction of protein subcellular localization (P2SL), which predicts the subcellular localization of proteins in eukaryotic organisms from the amino acid content and amino acid order of their primary sequences. Our approach is to find the most frequent motifs for each protein class by clustering and then to use these motifs as features for classification. Because the features are motifs rather than whole sequences, the classification is independent of sequence length. The approach also provides a means to perform reverse analysis and to extract rules. More importantly, we describe a new encoding scheme for amino acids, based on the point accepted mutation (PAM) substitution matrix, that conserves biological function. We present preliminary results of our system on a two-class (dichotomous) classifier; the system can be extended to multiple classes with some modifications.
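The idea of length-independent motif features can be illustrated with a minimal sketch. It is not the P2SL implementation: exact counting of fixed-length windows stands in for the clustering step, the PAM-based encoding is omitted, and the function names, window length `k`, and toy sequences are all hypothetical.

```python
from collections import Counter

def windows(seq, k):
    """Return all overlapping length-k substrings of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def top_motifs(seqs, k, n):
    """Return the n most frequent length-k windows across one class.

    Assumption: plain counting replaces the clustering used in the paper.
    """
    counts = Counter(w for s in seqs for w in windows(s, k))
    return [motif for motif, _ in counts.most_common(n)]

def featurize(seq, motifs, k):
    """Map a sequence of any length to a vector with one entry per motif.

    Each entry is the fraction of the sequence's windows equal to that motif.
    """
    ws = windows(seq, k)
    counts = Counter(ws)
    total = max(len(ws), 1)  # avoid division by zero for very short input
    return [counts[m] / total for m in motifs]

# Toy strings for illustration only; these are not real proteins.
class_a = ["MKKLLAAKKLL", "AKKLLMKK"]
class_b = ["GSGSGSTT", "TTGSGSGS"]

K = 3
motifs = top_motifs(class_a, K, 2) + top_motifs(class_b, K, 2)
short_vec = featurize("GSGSG", motifs, K)
long_vec = featurize("MKKLLGSGSGSTTAKKLLMKK", motifs, K)
```

Both `short_vec` and `long_vec` have one entry per motif, so a standard classifier can consume them regardless of how long the input sequences are.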
This work was supported by the Turkish Academy of Sciences to R.Ç.A. (in the framework of the Young Scientist Award Program-RCA/TÜBA-GEBİP/2001-2-3).
© 2003 Springer-Verlag Berlin Heidelberg
Özarar, M., Atalay, V., Atalay, R.Ç. (2003). Prediction of Protein Subcellular Localization Based on Primary Sequence Data. In: Yazıcı, A., Şener, C. (eds) Computer and Information Sciences - ISCIS 2003. ISCIS 2003. Lecture Notes in Computer Science, vol 2869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39737-3_76
DOI: https://doi.org/10.1007/978-3-540-39737-3_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20409-1
Online ISBN: 978-3-540-39737-3
eBook Packages: Springer Book Archive