High Throughput Sequencing-Based Approaches for Gene Expression Analysis

  • R. Raja Sekhara Reddy
  • M. V. Ramanujam
Part of the Methods in Molecular Biology book series (MIMB, volume 1783)


Next-generation sequencing has emerged as the method of choice for answering fundamental questions in biology. Massively parallel sequencing for RNA-Seq analysis enables a better understanding of gene expression patterns in both model and nonmodel organisms. Sequencing itself has become a commodity, whereas analyzing and interpreting the huge amounts of data it produces remains a significant challenge. This chapter discusses the complexities involved in sequencing and analysis and aims to simplify sequencing-based gene expression analysis. The methods and analysis workflow are presented with biologists and experimental scientists in mind.

Key words

RNA; RNAseq; Transcriptome; NGS; Gene expression



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • R. Raja Sekhara Reddy (1)
  • M. V. Ramanujam (1)
  1. Clevergene Biocorp Private Limited, Bangalore, India
