Bioinformatics Pipeline for Transcriptome Sequencing Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1468))

Abstract

The development of high-throughput sequencing (HTS) for RNA profiling (RNA-seq) has shed light on the diversity of transcriptomes. While RNA-seq is becoming the de facto standard for monitoring the population of transcripts expressed in a given condition at a specific time, processing the large volume of data it generates requires dedicated bioinformatics programs. Here, we describe a standard bioinformatics protocol built on three state-of-the-art tools: the STAR mapper to align reads to a reference genome, Cufflinks to reconstruct the transcriptome, and RSEM to quantify the expression levels of genes and transcripts. We present the workflow using human transcriptome sequencing data from two biological replicates of the K562 cell line, produced as part of the ENCODE3 project.
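The three steps named in the abstract can be pictured as a chain of command lines. The following is only a minimal sketch, not the chapter's actual commands: the function name `build_pipeline`, every file and directory name, the thread count, the assumption of paired-end reads, and the choice of options are illustrative assumptions. It also assumes that a STAR genome index and an RSEM reference have already been built. The script only assembles and prints the commands; it does not run the tools.

```python
import shlex
from typing import Dict, List


def build_pipeline(sample: str, fastq1: str, fastq2: str,
                   star_index: str, annotation_gtf: str,
                   rsem_reference: str, threads: int = 8) -> Dict[str, List[str]]:
    """Assemble STAR, Cufflinks and RSEM command lines for one sample.

    All paths are hypothetical placeholders; nothing is executed here.
    """
    prefix = f"{sample}."
    # File names STAR derives from --outFileNamePrefix.
    genome_bam = f"{prefix}Aligned.sortedByCoord.out.bam"
    transcriptome_bam = f"{prefix}Aligned.toTranscriptome.out.bam"

    # Step 1: align reads to the genome with STAR.
    star = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", star_index,
        "--readFilesIn", fastq1, fastq2,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        # Also project alignments onto transcripts, for RSEM.
        "--quantMode", "TranscriptomeSAM",
        # Assumption: adds the XS strand attribute that Cufflinks
        # expects on spliced reads from unstranded libraries.
        "--outSAMstrandField", "intronMotif",
        "--outFileNamePrefix", prefix,
    ]
    if fastq1.endswith(".gz"):
        # Let STAR decompress gzipped FASTQ files on the fly.
        star += ["--readFilesCommand", "zcat"]

    # Step 2: reconstruct transcripts with Cufflinks, guided by the annotation.
    cufflinks = [
        "cufflinks",
        "-p", str(threads),
        "-g", annotation_gtf,
        "-o", f"{sample}_cufflinks",
        genome_bam,
    ]

    # Step 3: quantify genes and transcripts with RSEM from the
    # transcriptome-space alignments written by STAR.
    rsem = [
        "rsem-calculate-expression",
        "--paired-end",
        "--alignments",
        "-p", str(threads),
        transcriptome_bam,
        rsem_reference,
        sample,
    ]
    return {"star": star, "cufflinks": cufflinks, "rsem": rsem}


if __name__ == "__main__":
    commands = build_pipeline(
        sample="K562_rep1",                 # hypothetical sample label
        fastq1="K562_rep1_1.fastq.gz",      # hypothetical read files
        fastq2="K562_rep1_2.fastq.gz",
        star_index="star_index",            # hypothetical STAR index directory
        annotation_gtf="annotation.gtf",    # hypothetical annotation file
        rsem_reference="rsem_ref/human",    # hypothetical RSEM reference prefix
    )
    for step, cmd in commands.items():
        print(f"# {step}")
        print(shlex.join(cmd))
```

Running the function once per biological replicate would give one set of commands per sample. Options such as library strandedness depend on how the libraries were prepared and should be checked against each tool's manual before use.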

The original version of this chapter was revised. An erratum to this chapter can be found at DOI 10.1007/978-1-4939-4035-6_17

Corresponding authors

Correspondence to Sarah Djebali Ph.D. or Thomas Derrien Ph.D.

Copyright information

© 2017 Springer Science+Business Media New York

About this protocol

Cite this protocol

Djebali, S., Wucher, V., Foissac, S., Hitte, C., Corre, E., Derrien, T. (2017). Bioinformatics Pipeline for Transcriptome Sequencing Analysis. In: Ørom, U. (eds) Enhancer RNAs. Methods in Molecular Biology, vol 1468. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-4035-6_14

  • DOI: https://doi.org/10.1007/978-1-4939-4035-6_14

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-4033-2

  • Online ISBN: 978-1-4939-4035-6

  • eBook Packages: Springer Protocols
