SuMo: A Tool for Protein Function Inference Based on 3D Structures Comparisons

Part of the Focus on Structural Biology book series (FOSB, volume 8)


Predicting the residues that are important for binding or recognition sites in protein 3D structures remains a challenge. Binding-site recognition is generally based on geometry, often combined with the physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant to the interaction with a specific ligand. In our group, we designed an innovative bioinformatics method called SuMo to detect similar 3-dimensional (3D) sites in proteins (Jambon et al. Protein-Struct Funct Genet 52:137–145, 2003). This approach allows the comparison of protein structures or substructures and detects local spatial similarities; its main advantage is its independence from both amino acid sequences and backbone structures. In contrast to existing tools, the method represents the protein structure as a set of stereochemical groups that are defined independently of the notion of amino acid. An efficient heuristic for finding similarities has been developed, which uses graphs of triangles of chemical groups to represent the protein structures. The SuMo (Surfing the Molecules) program allows the dynamic definition of chemical groups, the selection of sites in the proteins, and the management and screening of databases. The basic principle of SuMo has been used in several recent studies (Sperandio et al. J Chem Inf Model 47:1097–1110, 2007) (Doppelt-Azeroual et al. Protein Sci 19:847–867, 2010). To give access to the SuMo tool, we provided a web server (Jambon et al. Bioinformatics 21:3929–3930, 2005). This chapter describes the main rationale behind the design of the first release of SuMo. In addition, we propose a completely new set of parameters better suited to proteins, and finally we illustrate the power of the method with several biological examples. Two of them, dealing with serine proteases and lectins, are given for comparison; they illustrate the capability of SuMo to deal with two opposite modes of evolution, i.e. convergence and divergence. A new biological application dealing with beta-lactam-binding protein (PBB) molecules is also presented.
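The idea of comparing sites through triangles of chemical groups, independently of sequence and backbone, can be illustrated with a minimal Python sketch. This is not the SuMo implementation: the group labels, the distance cutoff `max_edge`, the tolerance `tol` and all function names are assumptions made for illustration, and the graph step that links neighbouring triangles is omitted.

```python
from itertools import combinations, permutations
from math import dist

# A chemical group is (label, (x, y, z)). Labels are illustrative only.

def triangles(groups, max_edge=8.0):
    """Return all triples of groups whose pairwise distances are <= max_edge."""
    return [
        (a, b, c)
        for a, b, c in combinations(groups, 3)
        if all(dist(p[1], q[1]) <= max_edge for p, q in ((a, b), (b, c), (a, c)))
    ]

def edges(t):
    """Side lengths of a triangle, in a fixed vertex order."""
    a, b, c = t
    return (dist(a[1], b[1]), dist(b[1], c[1]), dist(a[1], c[1]))

def similar(t1, t2, tol=1.0):
    """True if some vertex ordering of t2 matches t1 in labels and side lengths."""
    e1 = edges(t1)
    for p in permutations(t2):
        same_labels = all(x[0] == y[0] for x, y in zip(t1, p))
        if same_labels and all(abs(u - v) <= tol for u, v in zip(e1, edges(p))):
            return True
    return False

def matching_triangles(groups_a, groups_b, max_edge=8.0, tol=1.0):
    """Pairs of similar triangles between two sets of chemical groups."""
    tri_b = triangles(groups_b, max_edge)
    return [(t1, t2) for t1 in triangles(groups_a, max_edge)
            for t2 in tri_b if similar(t1, t2, tol)]

# Two hypothetical sites: same groups, different order, position and slight distortion.
site_a = [("hydroxyl", (0.0, 0.0, 0.0)),
          ("imidazole", (3.0, 0.0, 0.0)),
          ("carboxylate", (0.0, 4.0, 0.0))]
site_b = [("carboxylate", (10.0, 14.2, 10.0)),
          ("hydroxyl", (10.0, 10.0, 10.0)),
          ("imidazole", (13.0, 10.0, 10.0))]

print(len(matching_triangles(site_a, site_b)))
```

Because only labels and inter-group distances are compared, the match does not depend on residue numbering or backbone trace. The sketch does not distinguish a triangle from its mirror image, which a real method would need to address.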


Proteins; Structural bioinformatics; 3D structure; Physico-chemical groups; 3D sites; Annotation; Triangle form; Delta-plus; Delta-minus; Glycine polar; Hydrophobic aliphatic; Carbon alpha; Hydrophobic aromatic; Target structure; Objects; Proteases; Isomerases; Lectins; Betalactam; Penicillin drug; Cephalosporin drug; Ceftazidime; Serine proteases; Protein-protein interaction; SuMo



Thanks are due to Martin Jambon, the main author of the original SuMo program, written in OCaml.


  1. Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
  2. Ballester PJ, Richards WG (2007) Ultrafast shape recognition to search compound databases for similar molecular shapes. J Comput Chem 28:1711–1723
  3. Banerjee RDK, Ravishankar R, Suguna K, Surolia A, Vijayan M (1996) Conformation, protein-carbohydrate interactions and a novel subunit association in the refined structure of peanut lectin-lactose complex. J Mol Biol 259:281–296
  4. Bertolazzi P, Guerra C, Liuzzi G (2010) A global optimization algorithm for protein surface alignment. BMC Bioinformatics 11:488
  5. Capra JA, Singh M (2007) Predicting functionally important residues from sequence conservation. Bioinformatics 23:1875–1882
  6. Capra JA, Laskowski RA, Thornton JM, Singh M, Funkhouser TA (2009) Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure. PLoS Comput Biol 5:e1000585
  7. Cars O, Molstad S, Melander A (2001) Variation in antibiotic use in the European Union. Lancet 357:1851–1853
  8. Chen Y, Shoichet B, Bonnet R (2005) Structure, function, and inhibition along the reaction coordinate of CTX-M beta-lactamases. J Am Chem Soc 127:5423–5434
  9. Coenen S, Ferech M, Dvorakova K, Hendrickx E, Suetens C, Goossens H (2006) European surveillance of antimicrobial consumption (ESAC): outpatient cephalosporin use in Europe. J Antimicrob Chemother 58:413–417
  10. Doppelt-Azeroual O, Delfaud F, Moriaud F, de Brevern AG (2010) Fast and automated functional classification with MED-SuMo: an application on purine-binding proteins. Protein Sci 19:847–867
  11. Erdin S, Ward RM, Venner E, Lichtarge O (2010) Evolutionary trace annotation of protein function in the structural proteome. J Mol Biol 396:1451–1473
  12. Ferech M, Coenen S, Dvorakova K, Hendrickx E, Suetens C, Goossens H (2006) European surveillance of antimicrobial consumption (ESAC): outpatient penicillin use in Europe. J Antimicrob Chemother 58:408–412
  13. Gordon EJ, Mouz N, Duee E, Dideberg O (2000) The crystal structure of the penicillin-binding protein 2x from Streptococcus pneumoniae and its acyl-enzyme form: implication in drug resistance. J Mol Biol 299(2):477–485
  14. Hoffmann B, Zaslavskiy M, Vert JP, Stoven V (2010) A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction. BMC Bioinformatics 11(1):99
  15. Holm L, Sander C (1997) Dali/FSSP classification of three-dimensional protein folds. Nucleic Acids Res 25:231–234
  16. Hulo N, Bairoch A, Bulliard V, Cerutti L, Cuche BA, de Castro E, Lachaize C, Langendijk-Genevaux PS, Sigrist CJA (2008) The 20 years of PROSITE. Nucleic Acids Res 36:D245–D249
  17. Jambon M, Imberty A, Deleage G, Geourjon C (2003) A new bioinformatic approach to detect common 3D sites in protein structures. Protein-Struct Funct Genet 52:137–145
  18. Jambon M, Andrieu O, Combet C, Deleage G, Delfaud F, Geourjon C (2005) The SuMo server: 3D search for protein functional sites. Bioinformatics 21:3929–3930
  19. Janin J (2010) Protein-protein docking tested in blind predictions: the CAPRI experiment. Mol Biosyst 6:2351–2362
  20. Janin J, Henrick K, Moult J, Ten Eyck L, Sternberg MJE, Vajda S, Vakser I, Wodak SJ (2003) CAPRI: a Critical Assessment of PRedicted Interactions. Protein-Struct Funct Bioinform 52:2–9
  21. Kashima A, Inoue Y, Sugio S, Maeda I, Nose T, Shimohigashi Y (1998) X-ray crystal structure of a dipeptide-chymotrypsin complex in an inhibitory interaction. Eur J Biochem 255:12–23
  22. Kristensen DM, Ward RM, Lisewski AM, Erdin S, Chen BY, Fofanov VY, Kimmel M, Kavraki LE, Lichtarge O (2008) Prediction of enzyme function based on 3D templates of evolutionarily important amino acids. BMC Bioinformatics 9(1):17
  23. Mathews II, Vanderhoff-Hanaver P, Castellino FJ, Tulinsky A (1996) Crystal structures of the recombinant kringle 1 domain of human plasminogen in complexes with the ligands epsilon-aminocaproic acid and trans-4-(aminomethyl)cyclohexane-1-carboxylic acid. Biochemistry 35:2567–2576
  24. Moriaud F, Doppelt-Azeroual O, Martin L, Oguievetskaia K, Koch K, Vorotyntsev A, Adcock SA, Delfaud F (2009) Computational fragment-based approach at PDB scale by protein local similarity. J Chem Inf Model 49:280–294
  25. Pearson WR (1991) Searching protein-sequence libraries – comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11:635–650
  26. Reisen F, Weisel M, Kriegl JM, Schneider G (2010) Self-organizing fuzzy graphs for structure-based comparison of protein pockets. J Proteome Res 9:6498–6510
  27. Sael L, La D, Li B, Rustamov R, Kihara D (2008) Rapid comparison of properties on protein surface. Protein-Struct Funct Bioinform 73:1–10
  28. Schalon C, Surgand JS, Kellenberger E, Rognan D (2008) A simple and fuzzy method to align and compare druggable ligand-binding sites. Protein-Struct Funct Bioinform 71:1755–1778
  29. Shulman-Peleg A, Mintz S, Nussinov R, Wolfson HJ (2004) Protein-protein interfaces: recognition of similar spatial and chemical organizations. Algorithm Bioinform Proc 3240:194–205
  30. Sigrist CJA, Cerutti L, de Castro E, Langendijk-Genevaux PS, Bulliard V, Bairoch A, Hulo N (2010) PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res 38:D161–D166
  31. Sonavane S, Chakrabarti P (2010) Prediction of active site cleft using support vector machines. J Chem Inf Model 50:2266–2273
  32. Sperandio O, Andrieu O, Miteva MA, Vo MQ, Souaille M, Delfaud F, Villoutreix BO (2007) MED-SuMoLig: a new ligand-based screening tool for efficient scaffold hopping. J Chem Inf Model 47:1097–1110
  33. Tripos (2010) Sybyl X. St. Louis, MO 63144–2319 USA, Tripos Inc
  34. Venkatraman V, Sael L, Kihara D (2009) Potential for protein surface shape analysis using spherical harmonics and 3D Zernike descriptors. Cell Biochem Biophys 54:23–32
  35. Via A, Ferre F, Brannetti B, Helmer-Citterich M (2000) Protein surface similarities: a survey of methods to describe and compare protein surfaces. Cell Mol Life Sci 57:1970–1977
  36. Wallace AC, Borkakoti N, Thornton JM (1997) TESS: a geometric hashing algorithm for deriving 3D coordinate templates for searching structural databases. Application to enzyme active sites. Protein Sci 6:2308–2323
  37. Ward RM, Venner E, Daines B, Murray S, Erdin S, Kristensen DM, Lichtarge O (2009) Evolutionary trace annotation server: automated enzyme function prediction in protein structures using 3D templates. Bioinformatics 25:1426–1427
  38. Weskamp N, Kuhn D, Hullermeier E, Klebe G (2004) Efficient similarity search in protein structure databases by k-clique hashing. Bioinformatics 20:1522–1526

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  1. Université Lyon 1, CNRS, UMR 5086; Bases Moléculaires et Structurales des Systèmes Infectieux, Lyon, France