Abstract
A central challenge in Computational Protein Design (CPD) is exploring the amino-acid sequence space while accounting, to some extent, for side-chain flexibility. The exorbitant size of the search space calls for efficient exact deterministic search methods that identify low-energy sequence-conformation models, corresponding either to the global minimum energy conformation (GMEC) or to an ensemble of guaranteed near-optimal solutions. In contrast to stochastic local search methods, which are not guaranteed to find the GMEC, exact deterministic approaches always identify the GMEC and prove its optimality in finite, but in the worst case exponential, time. After a brief overview of these two classes of methods, we discuss the grounds and merits of four deterministic methods that have been applied to CPD problems. These approaches are based on the Dead-End Elimination theorem combined with the A* algorithm (DEE/A*), on Cost Function Network algorithms (CFN), on Integer Linear Programming solvers (ILP), or on Markov Random Field solvers (MRF). This review details how two of these methods (DEE/A* and CFN) can be used in practice to identify low-energy sequence-conformation models starting from a pairwise decomposed energy matrix.
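To make the notion of a pairwise decomposed energy matrix concrete, the following is a minimal sketch, not the chapter's method: it scores every rotamer assignment as a sum of one-body and two-body terms and finds the GMEC by plain exhaustive enumeration, which is exact but exponential in the number of positions. The names `UNARY`, `PAIRWISE`, `total_energy`, `enumerate_models`, `gmec`, and `within_window` are hypothetical, and the energy values are invented toy numbers rather than data from the chapter. DEE/A*, CFN, ILP, and MRF solvers aim to reach the same optimum while pruning most of this enumeration.

```python
from itertools import product

# Toy, made-up energies (assumption): 2 design positions, 2 rotamers each.
# UNARY[i][r] is the one-body energy of rotamer r at position i.
UNARY = [[0.0, 1.0], [0.5, 0.0]]
# PAIRWISE[(i, j)][r][s] is the two-body energy of rotamer r at i and s at j.
PAIRWISE = {(0, 1): [[2.0, 0.0], [0.0, 3.0]]}


def total_energy(assignment, unary, pairwise):
    """Sum the one-body and two-body terms of one rotamer assignment."""
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pairwise.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def enumerate_models(unary, pairwise):
    """Score every assignment (exponential cost) and sort by energy."""
    domains = [range(len(rotamers)) for rotamers in unary]
    scored = [(total_energy(a, unary, pairwise), a) for a in product(*domains)]
    return sorted(scored)


def gmec(unary, pairwise):
    """Return (energy, assignment) of the global minimum energy conformation."""
    return enumerate_models(unary, pairwise)[0]


def within_window(unary, pairwise, window):
    """Return all models whose energy is within `window` of the GMEC energy."""
    models = enumerate_models(unary, pairwise)
    best_energy = models[0][0]
    return [m for m in models if m[0] <= best_energy + window]


if __name__ == "__main__":
    print(gmec(UNARY, PAIRWISE))
    print(within_window(UNARY, PAIRWISE, 2.0))
```

The `within_window` helper mirrors the idea of an ensemble of guaranteed near-optimal solutions: because every assignment is scored, no model inside the energy window can be missed. The same guarantee is what exact solvers provide without visiting the whole space.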
Acknowledgments
This work has been funded by a grant from INRA and the Region Midi-Pyrénées and the “Agence Nationale de la Recherche,” references ANR 10-BLA-0214 and ANR-12-MONU-0015-03. We thank the Computing Center of Region Midi-Pyrénées (CALMIP, Toulouse, France) and the GenoToul Bioinformatics Platform of INRA-Toulouse for providing computing resources and support.
Copyright information
© 2017 Springer Science+Business Media New York
About this protocol
Cite this protocol
Traoré, S., Allouche, D., André, I., Schiex, T., Barbe, S. (2017). Deterministic Search Methods for Computational Protein Design. In: Samish, I. (eds) Computational Protein Design. Methods in Molecular Biology, vol 1529. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6637-0_4
DOI: https://doi.org/10.1007/978-1-4939-6637-0_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6635-6
Online ISBN: 978-1-4939-6637-0
eBook Packages: Springer Protocols