Abstract
Single-gene analysis examines one gene at a time and its relation to a specific phenotype, such as cancer development. Pathway analysis, in contrast, simplifies the analysis by considering, at one time, a group of genes that are involved in the same biological process. Pathway analysis has useful applications such as disease discovery, disease prevention, and drug development. Different data mining approaches can be applied in pathway analysis. In this paper, we review pathway analysis techniques for analyzing gene expression and propose a classification for them: techniques that detect significant pathways and techniques that discover new pathways. In addition, we summarize the data mining techniques used in pathway analysis.
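As an illustration of what detecting a significant pathway can look like in practice, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution. The abstract does not name this method; it is chosen here only as one common example, and the function names, gene identifiers, and pathway names are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the upper tail of the hypergeometric distribution exactly.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def pathway_enrichment(de_genes, pathways, background):
    """Return {pathway name: p-value} for over-representation of de_genes.

    de_genes   -- genes flagged as differentially expressed (assumed input)
    pathways   -- mapping of pathway name to its member genes (assumed input)
    background -- all genes measured in the experiment (assumed input)
    """
    background = set(background)
    de = set(de_genes) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        overlap = len(de & members)
        results[name] = hypergeom_sf(overlap, len(background), len(members), len(de))
    return results


# Toy data, invented for illustration only.
background = [f"g{i}" for i in range(1, 21)]
pathways = {
    "pathway_A": ["g1", "g2", "g3", "g4", "g5"],
    "pathway_B": ["g10", "g11", "g12", "g13", "g14"],
}
de_genes = ["g1", "g2", "g3", "g4"]

p_values = pathway_enrichment(de_genes, pathways, background)
for name, p in sorted(p_values.items(), key=lambda item: item[1]):
    print(f"{name}: p = {p:.4g}")
```

A small p-value suggests that the pathway contains more differentially expressed genes than expected by chance. In a real analysis the p-values would also need correction for multiple testing across pathways, which this sketch omits.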
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
AlAjlan, A., Badr, G. (2015). Data Mining in Pathway Analysis for Gene Expression. In: Perner, P. (eds) Advances in Data Mining: Applications and Theoretical Aspects. ICDM 2015. Lecture Notes in Computer Science, vol 9165. Springer, Cham. https://doi.org/10.1007/978-3-319-20910-4_6
DOI: https://doi.org/10.1007/978-3-319-20910-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20909-8
Online ISBN: 978-3-319-20910-4
eBook Packages: Computer Science, Computer Science (R0)