A New Bioinformatic Pipeline to Address the Most Common Requirements in RNA-seq Data Analysis

  • Conference paper
9th International Conference on Practical Applications of Computational Biology and Bioinformatics

Abstract

Many bioinformatic programs have been developed to analyze data from RNA-seq experiments. These programs are widely used and often included in computational pipelines. Nevertheless, there does not seem to be a precise definition of what constitutes a proper workflow for this kind of data. We present here a new workflow that takes into account the most common requirements for RNA-seq analysis, and that is implemented as an automatic pipeline to perform an efficient and complete evaluation.

Acknowledgements

This work was partially funded by the [14VI05] Contract-Programme from the University of Vigo. Also, it was supported by the European Union’s Seventh Framework Programme FP7/REGPOT-2012-2013.1 under grant agreement n° 316265 (BIOCAPS), the Agrupamento INBIOMED from DXPCTSUG-FEDER “unha maneira de facer Europa” (2012/273) and the “Platform of integration of intelligent techniques for analysis of biomedical information” project (TIN2013-47153-C3-3-R) from the Spanish Ministry of Economy and Competitiveness.

Author information

Corresponding author

Correspondence to Daniel Glez-Peña.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Graña, O., Rubio-Camarillo, M., Fdez-Riverola, F., Pisano, D.G., Glez-Peña, D. (2015). A New Bioinformatic Pipeline to Address the Most Common Requirements in RNA-seq Data Analysis. In: Overbeek, R., Rocha, M., Fdez-Riverola, F., De Paz, J. (eds) 9th International Conference on Practical Applications of Computational Biology and Bioinformatics. Advances in Intelligent Systems and Computing, vol 375. Springer, Cham. https://doi.org/10.1007/978-3-319-19776-0_13

  • DOI: https://doi.org/10.1007/978-3-319-19776-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19775-3

  • Online ISBN: 978-3-319-19776-0

  • eBook Packages: Engineering, Engineering (R0)
