DEGnet: Identifying Differentially Expressed Genes Using Deep Neural Network from RNA-Seq Datasets
Differential expression (DE) analysis and the identification of differentially expressed genes (DEGs) provide insights for the discovery of therapeutic drugs and the underlying mechanisms of disease. Statistical methods such as DESeq2, edgeR, and limma-voom produce a number of false positives and false negatives, and fail to differentiate DEGs into up-regulating (UR) and down-regulating (DR) genes and link them to disease progression. Machine learning (ML) methods, including deep learning (DL), that identify DEGs from RNA-seq data face challenges because the sample size (n) is small compared to the number of genes (g). In this work, we propose a deep neural network (DNN) called DEGnet to predict the UR and DR genes from Parkinson’s disease (PD) and breast cancer (BRCA) RNA-seq datasets. The accuracies we obtained on PD and BRCA were 100% and 87.5% respectively, higher than those of ML-based methods on the same datasets. To the best of our knowledge, we are the first to apply a DNN to classify DEGs into UR and DR genes and to identify significant UR and DR genes that play a role in the progression of a disease. Experimental results show that DEGnet performs well and can be applied to other RNA-seq datasets despite the n \(\ll\) g issue.
Keywords: Deep neural network, RNA-seq, Parkinson’s disease, Breast cancer