Modeling and analysis of RNA-seq data: a review from a statistical perspective

  • Wei Vivian Li
  • Jingyi Jessica Li
Review

Abstract

Background

Since their invention, next-generation RNA sequencing (RNA-seq) technologies have become a powerful tool for studying the presence and quantity of RNA molecules in biological samples, and they have revolutionized transcriptomic studies. The analysis of RNA-seq data at four different levels (samples, genes, transcripts, and exons) involves multiple statistical and computational questions, some of which remain challenging to date.

Results

We review RNA-seq analysis tools at the sample, gene, transcript, and exon levels from a statistical perspective. We also highlight the biological and statistical questions of greatest practical importance.

Conclusions

Statistical and computational methods for analyzing RNA-seq data have advanced significantly in the past decade. However, methods developed to answer the same biological question often rely on diverse statistical models and perform differently under different scenarios. This review discusses and compares multiple commonly used statistical models with regard to their assumptions, in the hope of helping users select appropriate methods as needed and assisting developers in future method development.

Keywords

RNA-seq; statistical modeling; differentially expressed genes; alternatively spliced exons; isoform reconstruction and quantification

Notes

Acknowledgements

This work was supported by the following grants: National Science Foundation DMS-1613338, NIH/NIGMS R01GM120507, PhRMA Foundation Research Starter Grant in Informatics, Johnson & Johnson WiSTEM2D Award, and Sloan Research Fellowship (to J.J.L.) and the UCLA Dissertation Year Fellowship (to W.V.L.). The authors would like to thank Dr. Lior Pachter at California Institute of Technology and Dr. Michael I. Love at University of North Carolina at Chapel Hill for their insightful feedback.

Copyright information

© Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Statistics, University of California, Los Angeles, Los Angeles, USA
  2. Department of Human Genetics, University of California, Los Angeles, Los Angeles, USA
