Abstract
The flood of publicly available data allows scientists to explore existing results and make new discoveries. Gene expression data is one type of biological data that captures activity inside the cell, and studying it may expose the mechanisms of disease development. However, limited computing resources or limited programming knowledge prevent many research groups from using the data effectively. Over the past decade or so, various web-based tools have been developed to analyze gene expression data. Different tools implement different analytical approaches, which often leads to different outcomes. This study compares three existing web-based gene expression analysis tools, namely Gene-set Activity Toolbox (GAT), NetworkAnalyst, and GEO2R, using six publicly available cancer data sets. The results of our case study show that NetworkAnalyst performs best, followed by GAT and then GEO2R.
W. Engchuan and P. Patumcharoenpol: These authors contributed equally to this work.
Acknowledgment
The authors would like to thank Ms. Katlin Kreamer-Tonin for proofreading this paper.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Engchuan, W., Patumcharoenpol, P., Chan, J.H. (2015). Comparative Study of Web-Based Gene Expression Analysis Tools for Biomarkers Identification. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9491. Springer, Cham. https://doi.org/10.1007/978-3-319-26555-1_25
DOI: https://doi.org/10.1007/978-3-319-26555-1_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26554-4
Online ISBN: 978-3-319-26555-1
eBook Packages: Computer Science, Computer Science (R0)