Computational Prediction of Short Linear Motifs from Protein Sequences

Edwards, Richard J.; Palopoli, Nicolas

doi:10.1007/978-1-4939-2285-7_6

Part of the book series: Methods in Molecular Biology (MIMB, volume 1268)

Abstract

Short Linear Motifs (SLiMs) are functional protein microdomains that typically mediate interactions between a short linear region in one protein and a globular domain in another. SLiMs usually occur in structurally disordered regions and mediate low-affinity interactions. Most SLiMs are 3–15 amino acids in length and have 2–5 defined positions, making them highly likely to occur by chance and extremely difficult to identify. Nevertheless, our knowledge of SLiMs and our capacity to predict them from protein sequence data using computational methods have advanced dramatically over the past decade. By considering the biological, structural, and evolutionary context of SLiM occurrences, it is possible to differentiate functional instances from chance matches in many cases and to identify new regions of proteins that have features consistent with a SLiM-mediated interaction. Their simplicity also makes SLiMs evolutionarily labile and prone to independent origins on different sequence backgrounds through convergent evolution, which can be exploited for predicting novel SLiMs in proteins that share a function or interaction partner.
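
To illustrate why a motif with only a few defined positions is so likely to occur by chance, the following is a minimal Python sketch that estimates the expected number of chance matches to a simple fixed-length motif. It is not taken from the chapter or from any SLiM prediction tool: the function names, the restriction to fixed residues, bracketed residue sets and "." wildcards, and the default assumption of uniform amino acid frequencies (1/20 each) at independent positions are all illustrative assumptions.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_simple_motif(motif):
    """Split a fixed-length motif such as 'P.[LV]P' into per-position residue sets.

    Hypothetical helper: only single residues, '[...]' sets and '.' wildcards
    are supported; variable-length elements are rejected.
    """
    tokens = re.findall(r"\[[A-Z]+\]|[A-Z.]", motif)
    if "".join(tokens) != motif:
        raise ValueError(f"unsupported motif syntax: {motif!r}")
    positions = []
    for token in tokens:
        if token == ".":
            positions.append(set(AMINO_ACIDS))  # wildcard: any residue
        else:
            positions.append(set(token.strip("[]")))
    return positions


def match_probability(motif, freqs=None):
    """Probability that one window matches, assuming independent positions."""
    if freqs is None:
        # Assumption: uniform background; real proteomes are not uniform.
        freqs = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    probability = 1.0
    for allowed in parse_simple_motif(motif):
        probability *= sum(freqs.get(aa, 0.0) for aa in allowed)
    return probability


def expected_chance_matches(motif, n_residues, freqs=None):
    """Expected number of matches in a sequence of n_residues residues."""
    width = len(parse_simple_motif(motif))
    windows = max(n_residues - width + 1, 0)
    return windows * match_probability(motif, freqs)


if __name__ == "__main__":
    # A motif with two defined positions, scanned against a notional
    # 500-residue protein and a notional dataset of 10 million residues.
    for size in (500, 10_000_000):
        print(size, expected_chance_matches("P..P", size))
```

Under these simplifying assumptions, each defined position contributes only a factor of roughly 1/20, so a motif with two or three defined positions is expected to match many times in any large sequence set. This is the background against which functional instances have to be distinguished.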

In this review, we explore our current knowledge of SLiMs and how it can be applied to the task of predicting them computationally from protein sequences. Rather than focusing on specific SLiM prediction tools, we provide an overview of the methods available and concentrate on principles that should continue to be paramount even in the light of future developments. We consider the relative merits of using regular expressions or profiles for SLiM discovery and discuss the main considerations for both the prediction of new instances of known SLiMs and the de novo prediction of novel SLiMs. In particular, we highlight the importance of correctly modelling evolutionary relationships and the probability of false positive predictions.
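
The contrast between regular expressions and profiles can be sketched in a few lines of Python. This is an illustrative sketch only, not the method of any tool discussed in the chapter: the function names are hypothetical, the example peptides are made up, and the profile is a basic log-odds position-specific scoring matrix built with a pseudocount against an assumed uniform background.

```python
import math
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def regex_hits(pattern, sequence):
    """Return (start, match) for every regex match, including overlapping ones."""
    # A lookahead lets the scan restart at the next residue after each hit.
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


def build_pssm(instances, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned motif instances."""
    width = len(instances[0])
    if any(len(instance) != width for instance in instances):
        raise ValueError("all instances must have the same length")
    background = 1 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(instances) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for i in range(width):
        column = [instance[i] for instance in instances]
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def pssm_scan(pssm, sequence):
    """Score every window of the sequence; unknown residues score 0."""
    width = len(pssm)
    return [
        (start, sum(col.get(aa, 0.0)
                    for col, aa in zip(pssm, sequence[start:start + width])))
        for start in range(len(sequence) - width + 1)
    ]


if __name__ == "__main__":
    sequence = "AAPLVPAA"                # made-up sequence
    instances = ["PAAP", "PLVP", "PGSP"]  # made-up aligned instances
    print(regex_hits("P..P", sequence))
    print(max(pssm_scan(build_pssm(instances), sequence), key=lambda hit: hit[1]))
```

A regular expression gives a yes/no answer at each position, whereas the profile returns a graded score for every window, so any threshold applied to it trades sensitivity against the false positive rate.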

Abbreviations

DMI: Domain-motif interaction
ELM: Eukaryotic linear motif
FPR: False positive rate
GO: Gene ontology
HMM: Hidden Markov model
IDP: Intrinsically disordered protein
IDR: Intrinsically disordered region
LDMS: (l, d) motif search
MnM: Minimotif miner
MoRF: Molecular recognition feature
MST: Minimum spanning tree
PPI: Protein-protein interaction
PSSM: Position-specific scoring matrix
PTM: Posttranslational modification
Regex: Regular expression
SLiM: Short linear motif
TPR: True positive rate

Author information

Authors and Affiliations

School of Biotechnology and Biomolecular Sciences, University of New South Wales, Room 263B, Biological Sciences Building, Building D26, Sydney, NSW, 2052, Australia
Richard J. Edwards
Centre for Biological Sciences, University of Southampton, Southampton, UK
Richard J. Edwards & Nicolas Palopoli
Institute for Life Sciences, University of Southampton, Southampton, UK
Richard J. Edwards

Authors

Richard J. Edwards
Nicolas Palopoli

Corresponding author

Correspondence to Richard J. Edwards.

Editor information

Editors and Affiliations

Center of Bioinformatics (COBI), School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
Peng Zhou
Center of Bioinformatics (COBI), School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu, China
Jian Huang

Cite this protocol

Edwards, R.J., Palopoli, N. (2015). Computational Prediction of Short Linear Motifs from Protein Sequences. In: Zhou, P., Huang, J. (eds) Computational Peptidology. Methods in Molecular Biology, vol 1268. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-2285-7_6

DOI: https://doi.org/10.1007/978-1-4939-2285-7_6
Published: 11 December 2014
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-2284-0
Online ISBN: 978-1-4939-2285-7
eBook Packages: Springer Protocols
