Abstract
The cell factory Aspergillus niger is widely used for industrial enzyme production. To select candidate proteins for large-scale production, we developed a sequence-based classifier that predicts whether an over-expressed homologous protein will be successfully produced and secreted. A dataset of 638 proteins was used to train and validate the classifier under a 10-fold cross-validation protocol. With a linear discriminant classifier, an average accuracy of 0.85 was achieved. The feature selection results indicate which features are most decisive for successful protein production, which could offer a lead for coupling sequence characteristics to the biological processes involved in protein production and secretion.
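The evaluation protocol described in the abstract can be illustrated with a minimal sketch. Several parts of it are assumptions, not the authors' method: the abstract does not name the features, so amino acid composition is used as a stand-in; a nearest-mean rule (a linear discriminant with identity covariance and equal priors) replaces the full linear discriminant classifier; and the sequences are synthetic toy data, not the 638-protein dataset. All function names are hypothetical.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each of the 20 standard amino acids (assumed feature set)."""
    counts = Counter(seq)
    n = max(len(seq), 1)
    return [counts[a] / n for a in AMINO_ACIDS]


def fit_nearest_mean(X, y):
    """Fit a linear rule w.x + b > 0 => class 1 from the two class means.

    This is a simplified linear discriminant (identity covariance),
    used here only as a stand-in for the classifier in the abstract.
    """
    dim = len(X[0])
    means = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    w = [m1 - m0 for m0, m1 in zip(means[0], means[1])]
    mid = [(m0 + m1) / 2 for m0, m1 in zip(means[0], means[1])]
    b = -sum(wj * mj for wj, mj in zip(w, mid))
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def stratified_folds(y, k, seed=0):
    """Split sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validate(X, y, k=10, seed=0):
    """Average held-out accuracy over k folds."""
    accuracies = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [t for i, t in enumerate(y) if i not in held_out]
        model = fit_nearest_mean(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


def make_toy_data(n_per_class=40, length=60, seed=1):
    """Synthetic sequences with class-specific residue bias (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, favored in ((0, "KRDE"), (1, "AGST")):
        alphabet = AMINO_ACIDS + favored * 5
        for _ in range(n_per_class):
            seq = "".join(rng.choice(alphabet) for _ in range(length))
            X.append(composition(seq))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("mean 10-fold accuracy:", cross_validate(X, y, k=10))
```

Each fold serves once as the held-out test set while the classifier is refit on the remaining nine, so the reported figure is an average over ten independent train/test splits. The stratified split keeps both classes present in every training set.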
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
van den Berg, B.A. et al. (2010). Sequence-Based Prediction of Protein Secretion Success in Aspergillus niger. In: Dijkstra, T.M.H., Tsivtsivadze, E., Marchiori, E., Heskes, T. (eds) Pattern Recognition in Bioinformatics. PRIB 2010. Lecture Notes in Computer Science, vol 6282. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16001-1_1
DOI: https://doi.org/10.1007/978-3-642-16001-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16000-4
Online ISBN: 978-3-642-16001-1
eBook Packages: Computer Science (R0)