High-Throughput Reconstruction of Ancestral Protein Sequence, Structure, and Molecular Function

  • Kelsey Aadland
  • Charles Pugh
  • Bryan Kolaczkowski
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1851)

Abstract

Ancestral protein sequence reconstruction is a powerful technique for explicitly testing hypotheses about the evolution of molecular function, allowing researchers to dissect how historical changes in protein sequence affected a protein’s functional repertoire by altering its 3D structure. These techniques have provided concrete, experimentally validated insights into ancient evolutionary processes and help illuminate the complex relationship between protein sequence, structure, and function. Both inferring the protein family phylogenies on which ancestral sequence reconstruction depends and reconstructing the sequences themselves are amenable to high-throughput computational analysis. However, determining the structures of ancestral-reconstructed proteins and characterizing their functions typically rely on time-consuming and expensive laboratory analyses, which limits most current studies to a relatively small number of specific hypotheses. For this reason, we have little detailed, unbiased information about how molecular function evolves across large protein family phylogenies. Here we describe a generalized protocol that integrates ancestral sequence reconstruction with structural homology modeling and structure-based molecular affinity prediction to characterize historical changes in protein function across families with thousands of individual sequences. We highlight key steps in the protocol that require particularly careful attention to avoid introducing errors, as well as steps for which computationally efficient subroutines can be substituted for more intensive approaches, allowing researchers to scale the analysis up or down depending on available resources and requirements for reproducibility and scientific rigor. In our view, this approach provides a compelling complement to more laboratory-intensive procedures, generating important contextual information that can help guide detailed experiments.
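
As a rough illustration of the sequence-reconstruction step mentioned above, the following minimal Python sketch computes the marginal posterior distribution of the ancestral residue at the root of a toy tree for a single alignment column. It is not taken from the chapter's supplementary scripts. The equal-rates (Poisson) amino acid model, the uniform prior at the root, the toy tree, and all function names are assumptions made for illustration; practical analyses would typically use an empirical replacement matrix and dedicated phylogenetics software.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def transition_prob(same, t, k=len(AMINO_ACIDS)):
    """P(child state | parent state) under an equal-rates k-state model.

    t is the branch length in expected substitutions per site
    (assumed model, chosen only because it has a closed form).
    """
    decay = math.exp(-k / (k - 1) * t)
    if same:
        return 1 / k + (1 - 1 / k) * decay
    return (1 - decay) / k


def conditional_likelihoods(node):
    """Felsenstein pruning for one site.

    A leaf is a one-letter string; an internal node is a list of
    (child, branch_length) pairs. Returns {state: P(tip data | state)}.
    """
    if isinstance(node, str):
        return {s: 1.0 if s == node else 0.0 for s in AMINO_ACIDS}
    result = {s: 1.0 for s in AMINO_ACIDS}
    for child, t in node:
        child_lik = conditional_likelihoods(child)
        for s in AMINO_ACIDS:
            result[s] *= sum(
                transition_prob(s == c, t) * child_lik[c] for c in AMINO_ACIDS
            )
    return result


def root_posterior(tree):
    """Marginal posterior at the root, assuming a uniform prior over states."""
    lik = conditional_likelihoods(tree)
    total = sum(lik.values())
    return {s: v / total for s, v in lik.items()}


# Hypothetical toy tree in Newick terms: ((A:0.1, A:0.1):0.05, S:0.3)
toy_tree = [([("A", 0.1), ("A", 0.1)], 0.05), ("S", 0.3)]
posterior = root_posterior(toy_tree)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

Repeating this calculation for every alignment column and every internal node yields a full set of ancestral sequences with per-site posterior probabilities, which is the kind of output that downstream structural modeling would consume.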

Key words

  • Ancestral sequence reconstruction
  • Structural modeling
  • Protein function prediction
  • Affinity prediction
  • Protein evolution
  • Molecular evolution

Supplementary material

426856_1_En_8_MOESM1_ESM.zip (10 kb)
Data 1: Python scripts and data required for the presented examples (ZIP, 10 KB)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Kelsey Aadland (1)
  • Charles Pugh (1)
  • Bryan Kolaczkowski (1, 2)

  1. Department of Microbiology & Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, USA
  2. Genetics Institute, University of Florida, Gainesville, USA
