Rab GTPases, pp 17-28

Bioinformatic Approaches to Identifying and Classifying Rab Proteins

  • Yoan Diekmann
  • José B. Pereira-Leal
Part of the Methods in Molecular Biology book series (MIMB, volume 1298)


The bioinformatic annotation of Rab GTPases is important, for example, to understand the evolution of the endomembrane system. However, Rabs are particularly challenging for standard annotation pipelines because they are similar to other small GTPases and form a large family with many paralogous subfamilies. Here, we describe a bioinformatic annotation pipeline specifically tailored to Rab GTPases. It proceeds in two steps: first, Rabs are distinguished from other proteins based on GTPase-specific motifs, overall sequence similarity to other Rabs, and the occurrence of Rab-specific motifs. Second, Rabs are classified taking either a more accurate but slower phylogenetic approach or a slightly less accurate but much faster bioinformatic approach. All necessary steps can either be performed locally or using the referenced online tools. An implementation of a slightly more involved version of the pipeline presented here is available at
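The two-step logic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the evidence fields, the thresholds, and the best-hit shortcut used for the fast classification step are all assumptions made for illustration, and the evidence itself (motif scans, similarity searches) is presumed to have been computed beforehand with external tools.

```python
from dataclasses import dataclass
from typing import Optional

# All thresholds are illustrative assumptions, not values from the chapter.
MAX_EVALUE = 1e-10      # hypothetical cutoff for similarity to known Rabs
MIN_RABF_MOTIFS = 1     # hypothetical minimum number of Rab-specific (RabF) motifs
MIN_IDENTITY = 0.40     # hypothetical identity required to assign a subfamily


@dataclass
class Evidence:
    """Pre-computed evidence for one protein (field names are hypothetical)."""
    has_gtpase_motifs: bool             # GTPase-specific motifs were detected
    best_rab_evalue: Optional[float]    # best hit against known Rabs, None if no hit
    rabf_motif_count: int               # number of Rab-specific motifs found
    best_hit_subfamily: Optional[str]   # subfamily of the most similar known Rab
    best_hit_identity: float            # fractional identity to that best hit


def is_rab(ev: Evidence) -> bool:
    """Step 1: require all three lines of evidence before calling a Rab."""
    if not ev.has_gtpase_motifs:
        return False
    if ev.best_rab_evalue is None or ev.best_rab_evalue > MAX_EVALUE:
        return False
    return ev.rabf_motif_count >= MIN_RABF_MOTIFS


def classify(ev: Evidence) -> Optional[str]:
    """Step 2 (fast variant, assumed here to be a best-hit assignment).

    Returns None for non-Rabs and a placeholder label when the best hit
    is too dissimilar to support a subfamily call.
    """
    if not is_rab(ev):
        return None
    if ev.best_hit_subfamily and ev.best_hit_identity >= MIN_IDENTITY:
        return ev.best_hit_subfamily
    return "unclassified Rab"
```

Keeping identification and classification as separate functions mirrors the two steps of the pipeline, so the fast best-hit step could be swapped for a phylogenetic one without touching the identification logic.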

Key words

Bioinformatics, RabF motifs, RabSF regions, Subfamily classification, Evolution



Acknowledgments

We thank Mark Gouw for including the links to the sequence and motif files on the Rabifier website. This work was supported by a grant from Fundação para a Ciência e Tecnologia (PTDC/EBB-BIO/119006/2010).


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Research Department of Genetics, Evolution and Environment, University College London, London, UK
  2. Instituto Gulbenkian de Ciência, Oeiras, Portugal
