Using Predictive Models to Engineer Biology: A Case Study in Codon Optimization

  • Alexey A. Gritsenko
  • Marcel J. T. Reinders
  • Dick de Ridder
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7986)


Given recent advances in synthetic biology and DNA synthesis, there is an increasing need for carefully engineered biological parts (e.g. genes, promoter sequences or enzymes) and circuits. However, forward engineering approaches are thus far rarely used in biology due to lack of detailed knowledge of the biological mechanisms. We describe a framework that enables forward engineering in biology by constructing models predictive of properties of interest, then inverting and using these models to design biological parts.

We demonstrate the applicability of the proposed framework on the problem of codon optimization, which is concerned with optimizing gene coding sequences for efficient translation. Results suggest that, in contrast to existing codon optimization techniques, our data-driven codon optimization (DECODON) method simultaneously considers the effects of multiple translation mechanisms to produce optimal sequences.
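The "build a predictive model, then invert it" idea can be illustrated with a minimal sketch. This is not the DECODON implementation: the surrogate `predict` function, the `TOY_WEIGHTS` values, the partial codon table and the simple single-objective genetic algorithm below are all assumptions made for illustration, and a real model (for example one trained with support vector regression) would replace `predict`.

```python
import random

# Partial synonymous-codon table from the standard genetic code (illustration only).
SYNONYMS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "E": ["GAA", "GAG"],
    "D": ["GAT", "GAC"],
}

# Hypothetical per-codon weights standing in for a trained predictive model.
# These numbers are made up and carry no biological meaning.
TOY_WEIGHTS = {
    "ATG": 1.0, "AAA": 0.4, "AAG": 0.9, "TTT": 0.3, "TTC": 0.8,
    "GAA": 0.7, "GAG": 0.5, "GAT": 0.6, "GAC": 0.65,
}


def predict(codons):
    """Toy surrogate model: mean weight of the codons in the sequence."""
    return sum(TOY_WEIGHTS[c] for c in codons) / len(codons)


def random_design(protein, rng):
    """Pick one synonymous codon at random for every amino acid."""
    return [rng.choice(SYNONYMS[aa]) for aa in protein]


def mutate(codons, protein, rng, rate=0.2):
    """Resample each codon among its synonyms with probability `rate`."""
    return [
        rng.choice(SYNONYMS[aa]) if rng.random() < rate else codon
        for codon, aa in zip(codons, protein)
    ]


def crossover(a, b, rng):
    """One-point crossover; both parents encode the same protein."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def design(protein, generations=50, pop_size=30, seed=0):
    """Invert the model: search codon sequences that maximise predict()."""
    rng = random.Random(seed)
    population = [random_design(protein, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=predict, reverse=True)
        parents = population[: pop_size // 2]  # keep the better half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   protein, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=predict)


if __name__ == "__main__":
    best = design("MKFEDK")
    print("".join(best), round(predict(best), 3))
```

Because mutation and crossover only ever swap synonymous codons, every candidate encodes the same protein, so the search changes only the property the model predicts.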


Keywords: synthetic biology, codon optimization, support vector regression, genetic algorithms



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Alexey A. Gritsenko (1, 2, 3)
  • Marcel J. T. Reinders (1, 2, 3)
  • Dick de Ridder (1, 2, 3)

  1. The Delft Bioinformatics Lab, Department of Intelligent Systems, Delft University of Technology, Delft, The Netherlands
  2. Platform Green Synthetic Biology, Delft, The Netherlands
  3. Kluyver Centre for Genomics of Industrial Fermentation, Delft, The Netherlands
