Abstract
We describe a statistical model that uses binomial logistic regression to predict the solubility of heterologous proteins expressed in E. coli. The model was developed from a set of proteins reported to have been expressed in E. coli in either soluble or insoluble form. We discuss the 22 parameters used in the final model, which is based on the amino acid composition of the proteins. The overall accuracy of the model is 94%. We also explain how to use the model on the website http://www.ou.edu/ to predict protein solubility.
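As a rough illustration of the approach named in the abstract, the sketch below computes amino acid fractions for a sequence and passes them through a binomial logistic function. It is not the published model: the feature set (plain residue fractions), the function names, and every coefficient are assumptions chosen for illustration, and the actual 22 parameters and their fitted values are those reported in the protocol and in Diaz et al. (2009).

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def soluble_probability(features, weights, intercept):
    """Binomial logistic regression: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# HYPOTHETICAL coefficients, for illustration only. They are NOT the fitted
# parameters of the model described in this protocol.
EXAMPLE_WEIGHTS = {"K": 4.0, "E": 3.0, "D": 2.0, "W": -5.0, "C": -3.0}
EXAMPLE_INTERCEPT = -0.5

if __name__ == "__main__":
    example_sequence = "MKKLEEDDKAEELLKKW"  # made-up sequence
    composition = amino_acid_composition(example_sequence)
    p = soluble_probability(composition, EXAMPLE_WEIGHTS, EXAMPLE_INTERCEPT)
    # Classify as soluble when the predicted probability is at least 0.5.
    label = "soluble" if p >= 0.5 else "insoluble"
    print(f"P(soluble) = {p:.3f} -> {label}")
```

The 0.5 cutoff used here is the conventional default for a two-class logistic model; whether the published model uses the same threshold is not stated in the abstract.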
References
Diaz AA, Tomba E, Lennarson R et al (2009) Prediction of protein solubility in Escherichia coli using logistic regression. Biotechnol Bioeng 105:374–383
Davis GD, Elisee C, Newham DM et al (1999) New fusion protein systems designed to give soluble expression in Escherichia coli. Biotechnol Bioeng 65:382–388
Walter S, Buchner J (2002) Molecular chaperones—cellular machines for protein folding. Angew Chem Int Ed Engl 41:1098–1113
Schein CH, Noteborn MHM (1988) Formation of soluble recombinant proteins in Escherichia coli is favored by lower growth temperature. Bio/Technology 6:291–294
Baneyx F, Mujacic M (2004) Recombinant protein folding and misfolding in Escherichia coli. Nat Biotechnol 22:1399–1408
Hosmer DW, Lemeshow S (2000) Applied logistic regression. Wiley, New York
Idicula-Thomas S, Balaji PV (2005) Understanding the relationship between the primary structure of proteins and its propensity to be soluble on overexpression in Escherichia coli. Protein Sci 14:582–592
Hopp TP, Woods KR (1981) Prediction of protein antigenic determinants from amino acid sequences. Proc Natl Acad Sci U S A 78:3824–3828
Acknowledgment
We thank graduate student Armando Diaz and undergraduate students Emanuele Tomba, Reese Lennarson, and Rex Richard for their help in developing the logistic regression model; undergraduate students Dolores Gutierrez-Cacciabue, Nathan Liles, and Zehra Tosun for their help in developing the protein database; and undergraduate student Andrew Lambeth for developing the website for the model.
Copyright information
© 2015 Springer Science+Business Media New York
About this protocol
Cite this protocol
Harrison, R.G., Bagajewicz, M.J. (2015). Predicting the Solubility of Recombinant Proteins in Escherichia coli. In: García-Fruitós, E. (eds) Insoluble Proteins. Methods in Molecular Biology, vol 1258. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-2205-5_23
DOI: https://doi.org/10.1007/978-1-4939-2205-5_23
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-2204-8
Online ISBN: 978-1-4939-2205-5
eBook Packages: Springer Protocols