Using Kendall-τ Meta-Bagging to Improve Protein-Protein Docking Predictions

  • Jérôme Azé
  • Thomas Bourquard
  • Sylvie Hamel
  • Anne Poupon
  • David W. Ritchie
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7036)


Predicting the three-dimensional (3D) structures of macromolecular protein-protein complexes from the structures of the individual partners (docking) is a major challenge for computational biology. Most docking algorithms use two largely independent stages. First, a fast sampling stage generates a large number (millions or even billions) of candidate conformations. Then a scoring stage evaluates these conformations and extracts a small ensemble amongst which a good solution is assumed to exist. Several strategies have been proposed for the scoring stage. However, correctly distinguishing native biological interfaces from false positives, and discarding the latter, remains a difficult task. Here, we introduce a new scoring algorithm based on learnt bootstrap aggregation (“bagging”) models of protein shape complementarity. 3D Voronoi diagrams are used to describe and encode the surface shapes and physico-chemical properties of proteins. A bagging method based on Kendall-τ distances is then used to minimise the pairwise disagreements between the ranks of the elements obtained from several different bagging approaches. We apply this method to the protein docking problem using 51 protein complexes from the standard Protein Docking Benchmark. Overall, our approach improves the ranks of near-native conformations and results in more biologically relevant predictions.
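To make the notion of pairwise disagreement concrete, the following minimal sketch computes the Kendall-τ distance between two rankings and finds, by exhaustive search, the ordering that minimises the summed distance to a set of input rankings. It is an illustration only and not the meta-bagging procedure of the paper: the function names, the candidate labels `c1` to `c4`, and the brute-force search are assumptions made for this example, and the search is feasible only for a handful of items.

```python
from itertools import combinations, permutations


def kendall_tau_distance(rank_a, rank_b):
    """Count the item pairs that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        # A negative product means x and y appear in opposite orders.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def brute_force_consensus(rankings):
    """Return the ordering with minimal total Kendall-tau distance.

    Hypothetical helper: tries every permutation, so it is only
    usable on very small item sets.
    """
    items = rankings[0]
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )


# Three illustrative rankings of four hypothetical docking candidates,
# e.g. as produced by three different scoring models.
rankings = [
    ["c1", "c2", "c3", "c4"],
    ["c2", "c1", "c3", "c4"],
    ["c1", "c3", "c2", "c4"],
]
consensus = brute_force_consensus(rankings)
print(consensus)
```

In this toy input each pair of candidates has a strict majority order across the three rankings, and those majority orders are mutually consistent, so the consensus simply follows them. For realistic numbers of candidate conformations an exhaustive search is out of reach, and heuristic or approximate aggregation schemes would be needed instead.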


Root Mean Square Deviation; Pairwise Disagreement; Protein Docking; Protein Data Bank Code; Docking Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jérôme Azé (1)
  • Thomas Bourquard (2)
  • Sylvie Hamel (3)
  • Anne Poupon (4, 5, 6)
  • David W. Ritchie (2)

  1. Laboratoire de Recherche en Informatique, Bâtiment 650, Équipe Bioinformatique – INRIA AMIB group, Université de Paris-Sud, Orsay, France
  2. INRIA Nancy-Grand Est, LORIA, Vandoeuvre-lès-Nancy, France
  3. Université de Montréal, Montréal, Canada
  4. BIOS group, INRA, UMR85, Unité Physiologie de la Reproduction et des Comportements, Nouzilly, France
  5. CNRS, UMR6175, Nouzilly, France
  6. Université François Rabelais, Tours, France
