Isoform Expression Analysis Based on RNA-seq Data

Statistical Analysis of Next Generation Sequencing Data

Part of the book series: Frontiers in Probability and the Statistical Sciences ((FROPROSTAS))

Abstract

The development of novel high-throughput DNA sequencing methods has provided a powerful technique for both mapping and quantifying transcriptomes. This technique, termed RNA-seq (RNA sequencing), has several advantages over microarray-based approaches: a wider dynamic range of expression, less reliance on existing knowledge of the genome sequence, and lower background noise. After the reads are aligned to the reference genome, the first step of RNA-seq analysis is to infer relative transcript abundances. This can be done at three levels: the whole-transcript level; the isoform-specific level, assuming a known set of isoforms; and the level at which transcripts are identified and their abundances quantified simultaneously. We briefly review these methods and discuss some recent developments in dealing with non-uniform read distribution within a transcript. We focus on methods for simultaneous transcript discovery and quantification.
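As an illustration of isoform-specific abundance estimation with a known set of isoforms, the following is a minimal sketch and not the chapter's own method. It assumes a simple mixture model in which each read is compatible with one or more isoforms and, within an isoform, reads are uniformly distributed along its length. The function names, the `compat` input format, and the toy data are hypothetical, introduced only for this example.

```python
def estimate_read_fractions(compat, lengths, n_iter=200):
    """EM estimate of the fraction of reads that originate from each isoform.

    compat  : list of sets; compat[i] holds the indices of the isoforms
              that read i is compatible with (hypothetical input format).
    lengths : list of isoform lengths (assumed uniform read coverage).
    """
    k = len(lengths)
    theta = [1.0 / k] * k  # start from equal read fractions
    for _ in range(n_iter):
        counts = [0.0] * k
        for isoforms in compat:
            # E-step: split the read among its compatible isoforms in
            # proportion to theta_j / length_j.
            weights = {j: theta[j] / lengths[j] for j in isoforms}
            total = sum(weights.values())
            if total == 0:
                continue  # read compatible with no isoform: ignore it
            for j, w in weights.items():
                counts[j] += w / total
        # M-step: renormalise the expected read counts.
        assigned = sum(counts)
        theta = [c / assigned for c in counts]
    return theta


def relative_abundance(theta, lengths):
    """Convert read fractions to length-normalised relative abundances."""
    rho = [t / l for t, l in zip(theta, lengths)]
    s = sum(rho)
    return [r / s for r in rho]


# Toy data: two isoforms of equal length, 30 reads unique to isoform 0,
# 10 unique to isoform 1, and 60 reads shared by both.
reads = [{0}] * 30 + [{1}] * 10 + [{0, 1}] * 60
theta = estimate_read_fractions(reads, lengths=[1000, 1000])
rho = relative_abundance(theta, lengths=[1000, 1000])
```

In this toy case the shared reads are split in the same 3:1 ratio as the uniquely assigned reads, so the estimated read fractions converge to 0.75 and 0.25. The sketch ignores the non-uniform read distribution mentioned in the abstract; handling it would require replacing the `1 / length` term with a position-specific read probability.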

Acknowledgements

This research is supported by NIH grants CA127334 and GM097505.

Author information

Correspondence to Hongzhe Li.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Li, H. (2014). Isoform Expression Analysis Based on RNA-seq Data. In: Datta, S., Nettleton, D. (eds) Statistical Analysis of Next Generation Sequencing Data. Frontiers in Probability and the Statistical Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-07212-8_12
