Computational Mutagenesis of E. coli Lac Repressor: Insight into Structure-Function Relationships and Accurate Prediction of Mutant Activity

  • Majid Masso
  • Kahkeshan Hijazi
  • Nida Parvez
  • Iosif I. Vaisman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


A computational mutagenesis methodology that utilizes a four-body, knowledge-based, statistical contact potential is applied toward quantifying relative changes (residual scores) to sequence-structure compatibility in E. coli lac repressor due to single amino acid residue substitutions. We show that these residual scores correlate well with experimentally measured relative changes in protein activity caused by the mutations. The approach also yields a measure of environmental perturbation at every residue position in the protein caused by the mutation (residual profile). Supervised learning with a decision tree algorithm, utilizing the residual profiles of over 4000 experimentally evaluated mutants for training, classifies the mutants based on activity with nearly 79% accuracy while achieving 0.80 area under the receiver operating characteristic curve. A trained decision tree model is subsequently used to infer the levels of activity for all remaining unexplored lac repressor mutants.
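The two quantities named in the abstract, the residual score and the residual profile, can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code below assumes one plausible formulation. The protein is reduced to a fixed set of four-residue simplices (for example, from a Delaunay tessellation). Each simplex is scored by a four-body potential looked up by residue composition. The residual score is the change in the summed score on mutation, and the residual profile is the per-position change in the summed score of the simplices containing that position. The score table, the simplices, the sequence, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical four-body scores keyed by the sorted residue quadruplet.
# Real values would be derived from a set of known protein structures;
# these numbers are invented for illustration only.
TOY_SCORES = {
    ("A", "G", "L", "V"): 0.30,
    ("D", "G", "L", "V"): -0.25,
    ("G", "L", "L", "V"): 0.10,
}


def simplex_score(seq, simplex, scores):
    """Score one simplex from the residue types at its four vertices."""
    key = tuple(sorted(seq[i] for i in simplex))
    return scores.get(key, 0.0)  # assumption: unseen quadruplets score 0


def total_potential(seq, simplices, scores):
    """Sum of simplex scores over the whole (fixed) tessellation."""
    return sum(simplex_score(seq, s, scores) for s in simplices)


def environment_profile(seq, simplices, scores):
    """Per-position sum of the scores of all simplices touching that position."""
    env = [0.0] * len(seq)
    for s in simplices:
        sc = simplex_score(seq, s, scores)
        for i in s:
            env[i] += sc
    return env


def mutate(seq, position, new_residue):
    """Return a copy of seq with one residue substituted (0-based position)."""
    mutant = list(seq)
    mutant[position] = new_residue
    return mutant


def residual_score(wild_type, mutant, simplices, scores):
    """Mutant minus wild-type total potential, tessellation held fixed."""
    return (total_potential(mutant, simplices, scores)
            - total_potential(wild_type, simplices, scores))


def residual_profile(wild_type, mutant, simplices, scores):
    """Mutant minus wild-type environment score at every position."""
    wt_env = environment_profile(wild_type, simplices, scores)
    mut_env = environment_profile(mutant, simplices, scores)
    return [m - w for m, w in zip(mut_env, wt_env)]


# Toy five-residue "protein" with two simplices (tuples of position indices).
wild_type = ["A", "L", "G", "V", "D"]
simplices = [(0, 1, 2, 3), (1, 2, 3, 4)]

mutant = mutate(wild_type, 0, "L")  # substitution A -> L at position 0
score = residual_score(wild_type, mutant, simplices, TOY_SCORES)
profile = residual_profile(wild_type, mutant, simplices, TOY_SCORES)
print(score, profile)
```

In this sketch only positions that share a simplex with the mutated residue receive nonzero profile entries, which is one way to read the abstract's "environmental perturbation at every residue position". A vector of this kind, one per mutant, is the sort of feature input a decision tree classifier could be trained on.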


lac repressor; Delaunay tessellation; statistical potential; computational mutagenesis; supervised learning





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Majid Masso
    • 1
  • Kahkeshan Hijazi
    • 1
  • Nida Parvez
    • 1
  • Iosif I. Vaisman
    • 1
  1. Laboratory for Structural Bioinformatics, George Mason University, Manassas, USA
