
Computationally Characterizing Protein-Bound Long Noncoding RNAs and Their Secondary Structure Using Protein Interaction Profile Sequencing (PIP-Seq) in Plants

  • Mengge Shan
  • Zachary D. Anderson
  • Brian D. Gregory (Email author)
Part of the Methods in Molecular Biology book series (MIMB, volume 1933)

Abstract

Two major components of posttranscriptional regulation are RNA–protein interactions and RNA secondary structure. While noncoding RNAs are far more abundant than messenger RNAs in eukaryotic systems, their functions remain largely unstudied. Evidence suggests that RNA–protein interactions and RNA secondary structure also regulate the function of long noncoding RNAs (lncRNAs), which are noncoding RNAs over 200 nucleotides (nt) in length. Protein interaction profile sequencing (PIP-seq) allows researchers to perform an unbiased screen of the protein-bound regions and secondary structure of RNAs throughout a transcriptome of interest. Using a peak calling approach, our pipeline identifies protein-protected sites (PPSs), which are putative RNA–protein interaction sites. Additionally, by taking the ratio of read coverage in double-stranded RNA (dsRNA)-seq libraries to that in single-stranded RNA (ssRNA)-seq libraries, our analysis calculates an RNA secondary structure score that reflects the likelihood that a region is composed of double- or single-stranded ribonucleotides. Researchers can also use this pipeline to examine specific regions of interest, such as known lncRNAs, to determine their protein-bound status and elucidate their secondary structure.
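The coverage-ratio idea behind the structure score can be illustrated with a minimal sketch. The abstract does not state the exact normalization used in the pipeline, so the log2 transform, the pseudocount, and the function name `structure_scores` are assumptions made here for illustration only, not the authors' implementation.

```python
import math
from typing import List, Sequence


def structure_scores(
    ds_cov: Sequence[float],
    ss_cov: Sequence[float],
    pseudocount: float = 1.0,
) -> List[float]:
    """Per-nucleotide log2 ratio of dsRNA-seq to ssRNA-seq coverage.

    Assumption: a simple log2 ratio with a pseudocount stands in for the
    pipeline's actual score. Positive values mean more dsRNA-seq reads
    (more likely paired); negative values mean more ssRNA-seq reads
    (more likely unpaired).
    """
    if len(ds_cov) != len(ss_cov):
        raise ValueError("coverage vectors must have the same length")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive to avoid log(0)")
    return [
        math.log2((ds + pseudocount) / (ss + pseudocount))
        for ds, ss in zip(ds_cov, ss_cov)
    ]


if __name__ == "__main__":
    # Hypothetical coverage values over a three-nucleotide window.
    print(structure_scores([3, 0, 1], [1, 0, 3]))
```

In practice the two coverage vectors would first be normalized for sequencing depth so that differences in library size do not bias the ratio; that step is omitted here.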

Key words

Long noncoding RNA (lncRNA); RNA secondary structure; RNA–protein interaction; RNA-binding proteins (RBPs); Messenger RNAs; RNA sequencing

Notes

Acknowledgments

The authors would like to thank the members of the Gregory lab both past and present for helpful discussions. This work was funded by NSF grants MCB-1243947, MCB-1623887, and IOS-1444490 to B.D.G.

References

  1. Vandivier LE, Anderson SJ, Foley SW et al (2016) The conservation and function of RNA secondary structure in plants. Annu Rev Plant Biol 67:463–488. https://doi.org/10.1146/annurev-arplant-043015-111754
  2. Ponjavic J, Ponting CP, Lunter G (2007) Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Res 17:556–565
  3. Chodroff RA, Goodstadt L, Sirey TM, Oliver PL et al (2010) Long noncoding RNA genes: conservation of sequence and brain expression among diverse amniotes. Genome Biol 11:R72. https://doi.org/10.1186/gb-2010-11-7-r72
  4. Barra J, Leucci E (2017) Probing long non-coding RNA-protein interactions. Front Mol Biosci 4:45. https://doi.org/10.3389/fmolb.2017.00045
  5. Foley SW, Gosai SJ, Wang D et al (2017) A global view of RNA-protein interactions identifies post-transcriptional regulators of root hair cell fate. Dev Cell 41:204–220. https://doi.org/10.1016/j.devcel.2017.03.018
  6. Gosai SJ, Foley SW, Wang D et al (2015) Global analysis of the RNA-protein interaction and RNA secondary structure landscapes of the Arabidopsis nucleus. Mol Cell 57:376–388. https://doi.org/10.1016/j.molcel.2014
  7. Silverman IM, Li F, Alexander A et al (2014) RNase-mediated protein footprint sequencing reveals protein-binding sites throughout the human transcriptome. Genome Biol 15:R3. https://doi.org/10.1186/gb-2014-15-1-r3
  8. Li H, Handsaker B, Wysoker A et al (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25:2078–2079. https://doi.org/10.1093/bioinformatics/btp352
  9. Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–2993. https://doi.org/10.1093/bioinformatics/btr509
  10. Quinlan AR, Hall IM (2010) BEDtools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842. https://doi.org/10.1093/bioinformatics/btq033
  11. Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17(1):10–12. https://doi.org/10.14806/ej.17.1.200
  12. Trapnell C, Pachter L, Salzberg S (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111. https://doi.org/10.1093/bioinformatics/btp120
  13. Muino J, Kaufmann K, van Ham R et al (2011) ChIP-seq Analysis in R (CSAR): an R package for the statistical detection of protein-bound genomic regions. Plant Methods 7:11. https://doi.org/10.1186/1746-4811-7-11

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Mengge Shan (1, 2)
  • Zachary D. Anderson (1)
  • Brian D. Gregory (1, 2; Email author)

  1. Department of Biology, University of Pennsylvania, Philadelphia, USA
  2. Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, USA
