mRNA Decay pp 111–129

Integration of ENCODE RNAseq and eCLIP Data Sets

  • Protocol

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1720))

Abstract

During the last decade, the study of mRNA decay has benefited greatly from a growing number of high-throughput assays that emerged from developments in next-generation sequencing (NGS) technologies and mass spectrometry. While assay-specific data analyses are often reported and the corresponding software made available, many researchers still struggle to integrate data from diverse assays, from different sources, and in different formats.

Here we use Python, R, and bash to analyze and integrate RNAseq and eCLIP data publicly available from ENCODE. Annotation is performed with BioMart, motif analysis with MEME, and functional enrichment analysis with DAVID. The analysis is centered on KHSRP eCLIP data from K562 cells as well as RNAseq data from KHSRP knockdown and the respective mock controls.
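As a rough illustration of what integrating the two data types can look like, the sketch below assigns eCLIP peaks to genes by coordinate overlap and joins the resulting peak counts with knockdown-versus-control expression changes. It is not the chapter's code: the function names, gene identifiers, coordinates, and fold-change values are all hypothetical, and intervals are assumed to be 0-based and half-open (BED-style).

```python
from collections import defaultdict


def overlaps(start_a, end_a, start_b, end_b):
    """Return True if two 0-based, half-open intervals share at least one base."""
    return start_a < end_b and start_b < end_a


def peaks_per_gene(peaks, genes):
    """Count peaks that overlap each gene on the same chromosome and strand.

    peaks: list of (chrom, start, end, strand) tuples.
    genes: dict mapping gene id -> (chrom, start, end, strand).
    """
    counts = defaultdict(int)
    for p_chrom, p_start, p_end, p_strand in peaks:
        for gene_id, (g_chrom, g_start, g_end, g_strand) in genes.items():
            same_locus = p_chrom == g_chrom and p_strand == g_strand
            if same_locus and overlaps(p_start, p_end, g_start, g_end):
                counts[gene_id] += 1
    return dict(counts)


def integrate(peak_counts, log2fc):
    """Join peak counts with expression changes; genes without peaks get 0."""
    return {
        gene_id: {"peaks": peak_counts.get(gene_id, 0), "log2fc": fc}
        for gene_id, fc in log2fc.items()
    }


# Hypothetical annotation, peaks, and fold changes (illustration only).
genes = {
    "GENE_A": ("chr1", 100, 500, "+"),
    "GENE_B": ("chr1", 400, 900, "-"),
    "GENE_C": ("chr2", 0, 300, "+"),
}
peaks = [
    ("chr1", 150, 200, "+"),
    ("chr1", 450, 480, "+"),
    ("chr1", 450, 480, "-"),
    ("chr2", 300, 350, "+"),  # starts exactly where GENE_C ends: no overlap
]
log2fc = {"GENE_A": 0.8, "GENE_B": -0.1, "GENE_C": 1.2}

peak_counts = peaks_per_gene(peaks, genes)
table = integrate(peak_counts, log2fc)
for gene_id, row in sorted(table.items()):
    print(gene_id, row["peaks"], row["log2fc"])
```

The nested loop is quadratic and only suitable for small inputs; for genome-scale peak sets an interval tree or a dedicated tool for interval intersection would be the more practical choice. Mixing 0-based and 1-based coordinate conventions between files is a common source of off-by-one errors, so the convention should be checked for every input.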

Acknowledgments

Thanks to the ENCODE Consortium and to the labs of Brenton Graveley at UConn and Gene Yeo at UCSD for generating the KHSRP knockdown and eCLIP data, respectively.

Author information

Corresponding author

Correspondence to Jorge Boucas.

Copyright information

© 2018 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Boucas, J. (2018). Integration of ENCODE RNAseq and eCLIP Data Sets. In: Lamandé, S. (eds) mRNA Decay. Methods in Molecular Biology, vol 1720. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7540-2_8

  • DOI: https://doi.org/10.1007/978-1-4939-7540-2_8

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7539-6

  • Online ISBN: 978-1-4939-7540-2

  • eBook Packages: Springer Protocols
