Protein Sequence Based Anomaly Detection for Neuro-Degenerative Disorders Through Deep Learning Techniques
The role of genetic information in brain disorders such as Alzheimer’s disease (AD) and Parkinson’s disease (PD) remains relatively unexplored. The aim of this investigation was to use computational techniques to predict, with improved accuracy and speed, the anomalies that cause neuro-degenerative brain disorders by analysing gene and protein sequence data. The proposed methodology employed deep learning techniques to identify the anomaly-causing genes that play a significant role in these disorders. The results showed that deep learning models outperform conventional machine learning models in identifying the optimal genes that cause neuro-degeneration.
Keywords: Alzheimer’s disease, Parkinson’s disease, Autoencoders, Anomaly detection
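The abstract does not give the network architecture or the feature set, so the following is only a minimal sketch of reconstruction-error anomaly scoring, the general idea behind autoencoder-based anomaly detection. It assumes each protein has already been converted to a fixed-length numeric feature vector (synthetic random data is used here), and a linear autoencoder fitted in closed form by SVD (equivalent to PCA) stands in for a deep autoencoder. The function names, the latent size of 2, and the 99th-percentile threshold are illustrative assumptions, not the authors' settings.

```python
import numpy as np


def fit_linear_autoencoder(X, n_latent):
    """Fit a linear encoder/decoder on X (rows = samples) via SVD (PCA)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions; keep the top n_latent of them.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_latent]


def reconstruction_error(X, mean, components):
    """Per-sample squared error after encoding and decoding."""
    centered = X - mean
    codes = centered @ components.T   # encode into the latent space
    recon = codes @ components        # decode back to feature space
    return np.sum((centered - recon) ** 2, axis=1)


def make_demo_data(seed=0):
    """Synthetic stand-in for protein feature vectors (assumption)."""
    rng = np.random.default_rng(seed)
    basis = rng.normal(size=(2, 8))

    def sample(n):
        # Typical samples lie close to a 2-D subspace of the 8-D space.
        return rng.normal(size=(n, 2)) @ basis + 0.05 * rng.normal(size=(n, 8))

    train = sample(200)
    typical = sample(20)
    # Atypical samples ignore the subspace structure entirely.
    atypical = rng.normal(scale=3.0, size=(5, 8))
    return train, typical, atypical


train, typical, atypical = make_demo_data()
mean, components = fit_linear_autoencoder(train, n_latent=2)

# Flag anything reconstructed worse than 99% of the training samples.
threshold = np.quantile(reconstruction_error(train, mean, components), 0.99)
typical_flags = reconstruction_error(typical, mean, components) > threshold
atypical_flags = reconstruction_error(atypical, mean, components) > threshold
```

A deep autoencoder would replace the two linear maps with trained nonlinear networks, but the scoring step stays the same: samples that the model reconstructs poorly are treated as candidate anomalies.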
This research work is part of a project funded by the Science and Engineering Research Board (SERB), Department of Science and Technology (DST), under the Young Scientist Scheme (Early Start-up Research Grant), titled “Investigation on the effect of Gene and Protein Mutants in the onset of Neuro-Degenerative Brain Disorders (Alzheimer’s and Parkinson’s disease): A Computational Study”, Reference No. SERB—YSS/2015/000737.