Using RAMPAGE to Identify and Annotate Promoters in Insect Genomes
The application of Transcription Start Site (TSS) profiling technologies, coupled with large-scale next-generation sequencing (NGS), has yielded valuable insights into the location, structure, and activity of promoters across diverse metazoan model systems. In insects, TSS profiling has been used to characterize the promoter architecture of Drosophila melanogaster (Hoskins et al., Genome Res 21(2):182–192, 2011) and was subsequently employed to reveal widespread transposon-driven alternative promoter usage in the fruit fly (Batut et al., Genome Res 23:169–180, 2012).
In this chapter we discuss the computational analysis of experimental data derived from one such TSS profiling method, RAMPAGE (RNA Annotation and Mapping of Promoters for Analysis of Gene Expression), which can be used for the precise, quantitative identification of promoters in insect genomes. We demonstrate this analysis using the software tools GoRAMPAGE (Brendel and Raborn, GoRAMPAGE: A workflow for promoter detection by 5′-read mapping. https://github.com/BrendelGroup/GoRAMPAGE, 2016) and TSRchitect (Raborn and Brendel, TSRchitect: promoter identification from large-scale TSS profiling data. R Bioconductor package version 1.8.0 [Online]. Available: http://bioconductor.org/packages/release/bioc/html/TSRchitect.html, 2017), providing detailed instructions with the aim of taking the user from raw reads to processed results.
Key words: cis-regulatory regions, Promoter architecture, Transcription initiation, Transcription start sites (TSSs)
The authors would like to thank Philippe Batut for generous technical assistance with the RAMPAGE protocol, and Nathan Keith for his help establishing the protocol in our laboratory. The authors are grateful to Thomas W. McCarthy for his help testing the code and providing editorial feedback.
Disclosure Declaration: The authors declare that they have no competing interests.
- 3. Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N et al (2005) The transcriptional landscape of the mammalian genome. Science (New York, NY) 309(5740):1559–1563
- 11. Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J et al (2006) Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet 38(6):626–635
- 12. Mwangi S, Attardo G, Suzuki Y, Aksoy S, Christoffels A (2015) TSS seq based core promoter architecture in blood feeding Tsetse fly (Glossina morsitans morsitans) vector of Trypanosomiasis. BMC Genomics 16(1):722
- 16. Batut PJ, Gingeras TR (2013) RAMPAGE: promoter activity profiling by paired-end sequencing of 5’-complete cDNAs. In: Ausubel FM et al (eds) Current protocols in molecular biology. Wiley, Hoboken, pp 25B.11.1–25B.11.16
- 18. Cumbie JS, Ivanchenko MG, Megraw M (2015) NanoCAGE-XL and CapFilter: an approach to genome wide identification of high confidence transcription start sites. BMC Genomics 16(1):528
- 20. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414):57–74
- 21. Consortium E (2017) Rampage and cage data standards and processing pipeline [Online]. Available: https://www.encodeproject.org/rampage/
- 23. Stewart CA, Cockerill TM, Foster I, Hancock D, Merchant N, Skidmore E et al (2015) Jetstream: a self-provisioned, scalable science and engineering cloud environment. In: Proceedings of the 2015 XSEDE conference: scientific advancements enabled by enhanced cyberinfrastructure. XSEDE ’15. ACM, New York, pp 29:1–29:8 [Online]. Available: https://doi.org/10.1145/2792745.2792774
- 24. Leinonen R, Sugawara H, Shumway M, International Nucleotide Sequence Database Collaboration (2011) The sequence read archive. Nucleic Acids Res 39(Database issue):D19–D21
- 25. Brendel VP, Raborn RT (2016) GoRAMPAGE: a workflow for promoter detection by 5’-read mapping. https://github.com/BrendelGroup/GoRAMPAGE
- 27. Lab H, FASTX Toolkit [Online]. Available: http://hannonlab.cshl.edu/fastx_toolkit/
- 28. Lassmann T (2015) TagDust2: a generic method to extract reads from sequencing data. BMC Bioinform 16(1):1
- 30. Dobin A, Gingeras TR (2016) Optimizing RNA-Seq mapping with STAR. In: Transcription factor regulatory networks. Springer, New York, pp 245–262
- 31. R Core Team (2017) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna [Online]. Available: https://www.R-project.org
- 33. Raborn RT, Brendel V (2017) TSRchitect: promoter identification from large-scale TSS profiling data. R Bioconductor package version 1.0.0 [Online]. Available: http://bioconductor.org/packages/release/bioc/html/TSRchitect.html
- 35. Haberle V, Forrest ARR, Hayashizaki Y, Carninci P, Lenhard B (2015) CAGEr: precise TSS data retrieval and high-resolution promoterome mining for integrative analyses. Nucleic Acids Res 43(8):e51
- 36. Pagès H (2016) BSgenome: infrastructure for Biostrings-based genome data packages and support for efficient SNP representation. R package version 1.42.0
- 39. Tange O (2018) GNU parallel 2018, p 112. ISBN 978-1-387-50988-1