Integrating Multi-scale Gene Features for Cancer Diagnosis
Cancer is one of the major diseases threatening human life. Advances in high-throughput sequencing technology make it possible to diagnose cancer accurately and to reveal its pathogenesis at the molecular level. In this study, we integrated differentially expressed genes with differential DNA methylation patterns and applied multiple machine learning methods to cancer diagnosis. The experimental results show that diagnostic performance improves significantly when multi-scale gene features at the RNA and epigenetic levels are integrated. Compared with using differentially expressed genes alone, the multi-scale gene features increased the classifier's AUC by 7.4%, which supports the effectiveness of integrating multi-scale gene features for cancer diagnosis.
Keywords: Cancer diagnosis; Machine learning; Gene expression; DNA methylation; High-throughput sequencing technology
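As a rough illustration of the comparison summarized in the abstract, the sketch below concatenates per-sample expression and methylation features and compares the AUC obtained with expression alone against the AUC obtained with the integrated features. The abstract does not state which classifiers, preprocessing, or data were used, so everything here is an assumption made for illustration: the function names, the toy values, the z-score scaling, and the leave-one-out nearest-centroid scorer are hypothetical and are not the study's pipeline.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    # Fall back to a scale of 1.0 for constant columns to avoid division by zero.
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def integrate_features(expression, methylation):
    """Concatenate scaled expression and methylation features per sample."""
    if len(expression) != len(methylation):
        raise ValueError("both matrices must describe the same samples")
    return [e + m for e, m in zip(zscore_columns(expression),
                                  zscore_columns(methylation))]


def auc(scores, labels):
    """AUC as the fraction of (tumor, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows):
    return [mean(c) for c in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_centroid_scores(features, labels):
    """Leave-one-out nearest-centroid score; higher means closer to the tumor class."""
    scores = []
    for i, x in enumerate(features):
        train = [(f, y) for j, (f, y) in enumerate(zip(features, labels)) if j != i]
        tumor = centroid([f for f, y in train if y == 1])
        normal = centroid([f for f, y in train if y == 0])
        scores.append(squared_distance(x, normal) - squared_distance(x, tumor))
    return scores


# Hypothetical toy data: 3 tumor (label 1) and 3 normal (label 0) samples.
expression = [[8.1, 2.0], [7.9, 2.2], [8.3, 1.9], [5.0, 2.1], [5.2, 2.0], [4.9, 2.3]]
methylation = [[0.80, 0.30], [0.75, 0.35], [0.85, 0.28],
               [0.20, 0.31], [0.25, 0.33], [0.22, 0.29]]
labels = [1, 1, 1, 0, 0, 0]

integrated = integrate_features(expression, methylation)
auc_expression = auc(loo_centroid_scores(zscore_columns(expression), labels), labels)
auc_integrated = auc(loo_centroid_scores(integrated, labels), labels)
print(f"expression only: {auc_expression:.3f}, integrated: {auc_integrated:.3f}")
```

For brevity, the scaling here is computed over all samples before the leave-one-out loop. A real evaluation would fit any scaling and feature selection inside each training fold only, so that no information from the held-out sample leaks into the model.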
This project was sponsored by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry (No. 48, 2014-1685), and the Key Natural Science Project of Anhui Provincial Education Department (KJ2017A016).