Analyzing Gene Pathways from Microarrays to Sequencing Platforms

  • Jeffrey Miecznikowski (corresponding author)
  • Dan Wang
  • Xing Ren
  • Jianmin Wang
  • Song Liu


Genetic microarrays have been the primary technology for quantitative transcriptome analysis since the mid-1990s. Using statistical testing methodology developed for microarray data, researchers can study the genes and gene pathways involved in a disease. More recently, a technology known as RNA-seq has been developed to study the transcriptome quantitatively. RNA-seq can also be used to study genes and gene pathways, although the statistical methodology used for microarrays must be adapted to the new platform. In this manuscript, we discuss methods of gene pathway analysis for microarrays and next-generation sequencing, and their advantages over standard “gene by gene” testing schemes.


Keywords: Pathway analysis, Microarrays, RNA-Seq, GSEA, GSA, Multiple testing



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Jeffrey Miecznikowski (1, 2), corresponding author
  • Dan Wang (2)
  • Xing Ren (1)
  • Jianmin Wang (2)
  • Song Liu (2)

  1. SUNY University at Buffalo, Department of Biostatistics, Buffalo, USA
  2. Roswell Park Comprehensive Cancer Center, Buffalo, USA
