Abstract
Feature selection techniques are often applied to identify cancer prognosis biomarkers. However, many feature selection methods are prone to over-fitting or yield features that are hard to interpret biologically when applied to high-dimensional biological data. Network-based feature selection and data integration have been proposed as ways to identify more robust biomarkers. We conducted experiments to investigate the advantages of these two approaches using the epithelial-mesenchymal transition (EMT) regulatory network, which has been shown to be highly relevant to cancer prognosis. We obtained data from The Cancer Genome Atlas and predicted prognosis using a Support Vector Machine. Under our experimental settings, network-based features gave significantly more accurate predictions than individual molecular features, and features selected from integrated data (RNA-Seq and micro-RNA data) gave significantly more accurate predictions than features selected from single-source data (RNA-Seq data alone). Our study indicates that biological network-based feature transformation and data integration are two useful approaches for identifying robust cancer biomarkers.
Notes
1. We applied the lasso function implemented in MATLAB R2015a to select the feature set that has the minimum mean squared error.
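The note does not say how the mean squared error was estimated. The following is a minimal sketch, assuming the error is a cross-validated MSE computed over a grid of regularization strengths. It uses a plain coordinate-descent lasso written in Python, not the MATLAB `lasso` function the authors used. The function names, the synthetic data, and the lambda grid are all illustrative assumptions.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    intercept = sum(y) / n
    residual = [y[i] - intercept for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero column carries no information
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + coef[j] * X[i][j])
                      for i in range(n)) / n
            new_coef = soft_threshold(rho, lam) / col_sq[j]
            delta = new_coef - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                coef[j] = new_coef
        # Re-centre the residual so the intercept stays optimal.
        shift = sum(residual) / n
        intercept += shift
        residual = [r - shift for r in residual]
    return intercept, coef


def cv_mse(X, y, lam, n_folds=5):
    """Mean squared prediction error of the lasso under k-fold cross-validation."""
    n = len(X)
    total = 0.0
    for fold in range(n_folds):
        test_idx = range(fold, n, n_folds)
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        b0, b = fit_lasso([X[i] for i in train_idx],
                          [y[i] for i in train_idx], lam)
        for i in test_idx:
            pred = b0 + sum(bj * xij for bj, xij in zip(b, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


def select_features(X, y, lambdas, n_folds=5):
    """Pick the lambda with minimum CV MSE and return its nonzero features."""
    best_lam = min(lambdas, key=lambda lam: cv_mse(X, y, lam, n_folds))
    _, coef = fit_lasso(X, y, best_lam)
    selected = [j for j, c in enumerate(coef) if c != 0.0]
    return best_lam, selected


# Synthetic stand-in for expression data: only features 0 and 3 drive the outcome.
random.seed(0)
n_samples, n_features = 60, 8
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[3] + random.gauss(0, 0.1) for row in X]
lambdas = [0.01, 0.05, 0.1, 0.5, 1.0]  # hypothetical grid

best_lam, selected = select_features(X, y, lambdas)
print("lambda with minimum CV MSE:", best_lam)
print("selected feature indices:", selected)
```

The selected set is whatever has a nonzero coefficient at the error-minimizing lambda, so its size is set by the data rather than fixed in advance.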
Acknowledgment
This study was funded by the German Ministry of Research and Education (BMBF) Project Grant 3FO18501 (Forschungscampus MODAL).
Appendix
The EMT network interactions are given in Table 4.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Shao, B., Conrad, T. (2016). Epithelial-Mesenchymal Transition Regulatory Network-Based Feature Selection in Lung Cancer Prognosis Prediction. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2016. Lecture Notes in Computer Science, vol 9656. Springer, Cham. https://doi.org/10.1007/978-3-319-31744-1_13
DOI: https://doi.org/10.1007/978-3-319-31744-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31743-4
Online ISBN: 978-3-319-31744-1
eBook Packages: Computer Science, Computer Science (R0)