Inference of Autism-Related Genes by Integrating Protein-Protein Interactions and miRNA-Target Interactions

  • Dang Hung Tran
  • Thanh-Phuong Nguyen
  • Laura Caberlotto
  • Corrado Priami
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 244)


Autism spectrum disorders (ASD) are a group of conditions characterized by impairments in social interaction and the presence of repetitive behavior. These complex neurological diseases are among the fastest growing developmental disorders and cause varying degrees of lifelong disability. Much research is under way to unravel the pathogenic mechanism of autism, and computational methods have emerged as a promising approach to aid physicians in studying it. In this paper, we present an efficient method to predict autism-related candidate genes (autism genes for short) by integrating a protein-protein interaction (PPI) network with a miRNA-target interaction network. We combine the two networks using a new technique that relies on shortest-path calculation. To demonstrate the performance of our method, we ran several experiments on three different PPI networks extracted from the BioGRID, HINT, and HPRD databases. Three supervised learning algorithms were employed: the Bayesian network, the random tree, and the random forest. Among them, the random forest performed best in terms of precision, recall, and F-measure, which indicates that the random forest algorithm has potential for inferring autism genes. Experiments carried out with five different shortest-path lengths in the PPI networks show the advantage of the method for studying autism genes on large-scale networks. In conclusion, the proposed method is beneficial in deciphering the pathogenic mechanism of autism.
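The abstract does not say how the shortest-path combination is carried out, so the following Python sketch is only one plausible reading and not the authors' implementation. It assumes an undirected PPI network stored as an adjacency dictionary, a miRNA-target network stored as a mapping from miRNA to target genes, and a set of known autism genes used as seeds. All names (`bfs_distances`, `gene_features`, `precision_recall_f`, the toy genes A to D, and miR-x and miR-y) are hypothetical. For a candidate gene it counts the seed genes found at each shortest-path length up to a cutoff, adds the number of miRNAs shared with a seed gene, and shows how precision, recall, and F-measure are computed from binary predictions.

```python
from collections import deque


def bfs_distances(adjacency, source, max_len):
    """Shortest-path lengths from `source` to every node within `max_len` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_len:
            continue  # do not expand beyond the path-length cutoff
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def gene_features(gene, ppi, mirna_targets, seed_genes, max_len=5):
    """Hypothetical feature vector: seed counts per path length + shared miRNAs."""
    dist = bfs_distances(ppi, gene, max_len)
    counts = [0] * max_len  # counts[k - 1] = seed genes at shortest-path length k
    for seed in seed_genes:
        d = dist.get(seed)
        if seed != gene and d is not None:
            counts[d - 1] += 1
    # miRNAs that target the candidate gene and at least one other seed gene
    shared = sum(
        1
        for targets in mirna_targets.values()
        if gene in targets and any(s in targets for s in seed_genes if s != gene)
    )
    return counts + [shared]


def precision_recall_f(true_labels, predicted_labels):
    """Precision, recall, and F-measure for binary (0/1) labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data (made up for illustration): a chain A - B - C - D
toy_ppi = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
toy_mirna_targets = {"miR-x": {"A", "D"}, "miR-y": {"B"}}
toy_seeds = {"B", "D"}

if __name__ == "__main__":
    print(gene_features("A", toy_ppi, toy_mirna_targets, toy_seeds, max_len=3))
    print(precision_recall_f([1, 1, 0, 0], [1, 0, 1, 0]))
```

Feature vectors of this kind, computed for each of the five path-length settings, could then be passed to any supervised classifier, such as a random forest. The choice of features here is an assumption made for illustration.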


Autism Spectrum Disorder, Bayesian Network, Random Forest, Random Forest Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Dang Hung Tran (1, corresponding author)
  • Thanh-Phuong Nguyen (2)
  • Laura Caberlotto (2)
  • Corrado Priami (2, 3)
  1. Hanoi National University of Education, Hanoi, Vietnam
  2. The Microsoft Research - University of Trento Centre for Computational Systems Biology, Trento, Italy
  3. Department of Mathematics, University of Trento, Trento, Italy
