Protein Fragment Swapping: A Method for Asymmetric, Selective Site-Directed Recombination

  • Wei Zheng
  • Karl E. Griswold
  • Chris Bailey-Kellogg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5541)


This paper presents a new approach to site-directed recombination, swapping combinations of selected discontiguous fragments from a source protein in place of corresponding fragments of a target protein. By being both asymmetric (differentiating source and target) and selective (swapping discontiguous fragments), our method focuses experimental effort on a more restricted portion of sequence space, constructing hybrids that are more likely to have the properties that are the objective of the experiment. Furthermore, since the source and target need to be structurally homologous only locally (rather than overall), our method supports swapping fragments from functionally important regions of a source into a target “scaffold”; e.g., to humanize an exogenous therapeutic protein. A protein fragment swapping plan is defined by the residue position boundaries of the fragments to be swapped; it is assessed by an average potential score over the resulting hybrid library, with singleton and pairwise terms evaluating the importance and fit of the swapped residues. While we prove that it is NP-hard to choose an optimal set of fragments under such a potential score, we develop an integer programming approach, which we call Swagmer, that works very well in practice. We demonstrate the effectiveness of our method in two types of swapping problem: selective recombination between beta-lactamases and activity swapping between glutathione transferases. We show that the selective recombination approach generates a better plan (in terms of resulting potential score) than a traditional site-directed recombination approach. We also show that in both cases the optimized experiment is significantly better than one that would result from stochastic methods.
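The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Swagmer integer program: it enumerates the hybrid library for a given plan by brute force and averages a potential with singleton and pairwise terms. The function names, the half-open `(start, end)` fragment convention, the inclusion of the unswapped target in the library, and the toy potentials are all assumptions made for illustration.

```python
from itertools import product


def build_hybrid(target, source, fragments, choice):
    """Copy the target and swap in source residues for each chosen fragment.

    fragments: list of half-open (start, end) residue ranges (assumed convention).
    choice: one boolean per fragment; True means "take this fragment from the source".
    """
    residues = list(target)
    for (start, end), swap in zip(fragments, choice):
        if swap:
            residues[start:end] = source[start:end]
    return "".join(residues)


def hybrid_score(seq, singleton, pairwise, contacts):
    """Sum singleton terms over all positions and pairwise terms over contact pairs."""
    total = sum(singleton(i, aa) for i, aa in enumerate(seq))
    total += sum(pairwise(i, j, seq[i], seq[j]) for i, j in contacts)
    return total


def average_plan_score(target, source, fragments, singleton, pairwise, contacts):
    """Average the potential over every swap/no-swap combination of the fragments."""
    choices = product([False, True], repeat=len(fragments))
    hybrids = [build_hybrid(target, source, fragments, c) for c in choices]
    scores = [hybrid_score(h, singleton, pairwise, contacts) for h in hybrids]
    return sum(scores) / len(scores), hybrids


# Hypothetical toy potentials, not taken from the paper.
def toy_singleton(i, aa):
    # Reward a residue that matches the (toy) source residue type.
    return 1.0 if aa == "C" else 0.0


def toy_pairwise(i, j, aa_i, aa_j):
    # Count contacting residues of different type.
    return 1.0 if aa_i != aa_j else 0.0


if __name__ == "__main__":
    avg, library = average_plan_score(
        target="AAAAAA",
        source="CCCCCC",
        fragments=[(0, 2), (4, 6)],
        singleton=toy_singleton,
        pairwise=toy_pairwise,
        contacts=[(1, 4)],
    )
    print(library, avg)
```

Brute-force enumeration grows as 2^k in the number of fragments, so it only serves to make the objective concrete; choosing the fragments themselves is the NP-hard part that the paper addresses with integer programming.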


Keywords: Protein Fragment, Glutathione Transferase, Residue Pair, Complementary Pair, Residue Position
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Wei Zheng (1)
  • Karl E. Griswold (2)
  • Chris Bailey-Kellogg (1)
  1. Department of Computer Science, Dartmouth College, Hanover, USA
  2. Thayer School of Engineering, Dartmouth College, Hanover, USA
