Predicting Gene-Disease Associations with Manifold Learning

  • Ping Luo
  • Li-Ping Tian
  • Bolin Chen
  • Qianghua Xiao
  • Fang-Xiang Wu (Email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10847)


In this study, we propose a manifold learning-based method for predicting disease genes, assuming that a disease and its associated genes should be consistent in some lower-dimensional manifold. In 10-fold cross-validation experiments with high-quality gene-disease associations from the OMIM dataset, the area under the receiver operating characteristic (ROC) curve (AUC) of our approach is 0.7452, which is greater than that of the competing method PBCF (0.5700). Nine of the top 10 predicted gene-disease associations are supported by existing literature, compared with 6 of the top 10 for PBCF. These results indicate that our method outperforms the competing method.
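As a reading aid for the AUC figures quoted above, the following is a minimal sketch of how an AUC value can be computed from ranked prediction scores. It is not the authors' evaluation code: the function name `roc_auc`, the toy labels, and the toy scores are all hypothetical, and the sketch uses the standard pairwise definition of AUC (the fraction of positive-negative pairs in which the positive is scored higher, with ties counted as one half).

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = known association, 0 = unknown).

    Uses the pairwise (Mann-Whitney) definition: the probability that a
    randomly chosen positive receives a higher score than a randomly chosen
    negative, counting ties as 0.5. Hypothetical helper for illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (invented): two known gene-disease pairs among five candidates.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In a 10-fold cross-validation setting, a value like this would typically be computed on the held-out associations of each fold; how the folds are aggregated in the paper is not stated in this abstract.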



This work is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), the China Scholarship Council (CSC), and the National Natural Science Foundation of China under Grant Nos. 61772552 and 61571052.


  1. Glaab, E., Bacardit, J., Garibaldi, J.M., Natalio, K.: Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data. PLoS ONE 7(7), e39932 (2012)
  2. Isakov, O., Dotan, I., Ben-Shachar, S.: Machine learning-based gene prioritization identifies novel candidate risk genes for inflammatory bowel disease. Inflamm. Bowel Dis. 23(9), 1516–1523 (2017)
  3. Mordelet, F., Vert, J.P.: Prodige: prioritization of disease genes with multitask machine learning from positive and unlabeled examples. BMC Bioinf. 12, 389 (2011)
  4. Chen, B., Wu, F.X.: Identifying protein complexes based on multiple topological structures in PPI networks. IEEE Trans. Nanobiosci. 12(3), 165–172 (2016)
  5. Chen, B., Li, M., Wang, J., Wu, F.X.: Disease gene identification by using graph kernels and Markov random fields. Sci. China Life Sci. 57(11), 1054–1063 (2014)
  6. Chen, B., Wang, J., Li, M., Wu, F.X.: Identifying disease genes by integrating multiple data sources. BMC Med. Genomics 7(2), S2 (2014)
  7. Chen, B., Li, M., Wang, J., Shang, X., Wu, F.X.: A fast and high performance multiple data integration algorithm for identifying human disease genes. BMC Med. Genomics 8(3), S2 (2015)
  8. Chen, B., Shang, X., Li, M., Wang, J., Wu, F.X.: Identifying individual-cancer-related genes by rebalancing the training samples. IEEE Trans. Nanobiosci. 15(4), 309–315 (2016)
  9. Luo, P., Tian, L.P., Ruan, J., Wu, F.X.: Disease gene prediction by integrating PPI networks, clinical RNA-Seq data and OMIM data. IEEE/ACM Trans. Comput. Biol. Bioinf. (2017, in press)
  10. Natarajan, N., Dhillon, I.S.: Inductive matrix completion for predicting gene-disease associations. Bioinformatics 30(12), i60–i68 (2014)
  11. Zeng, X., Ding, N., Rodríguez-Patón, A., Zou, Q.: Probability-based collaborative filtering model for predicting gene-disease associations. BMC Med. Genomics 10(S5), 76 (2017)
  12. Li, L., Wu, L., Zhang, H., Wu, F.X.: A fast algorithm for nonnegative matrix factorization and its convergence. IEEE Trans. Neural Netw. Learn. Syst. 25(10), 1855–1863 (2014)
  13. Li-Ping, T., Luo, P., Wang, H., Huiru, Z., Wu, F.X.: CASNMF: a converged algorithm for symmetrical nonnegative matrix factorization. Neurocomputing 275, 2031–2040 (2018)
  14. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
  15. Ham, J., Lee, D.D., Saul, L.K.: Semisupervised alignment of manifolds. In: AISTATS, pp. 120–127 (2005)
  16. Amberger, J.S., Bocchini, C.A., Schiettecatte, F., Scott, A.F., Hamosh, A.: OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 43(D1), D789–D798 (2014)
  17. Jolliffe, I.T.: Principal Component Analysis. Springer, New York (2002)
  18. Greenacre, M.J.: Theory and Applications of Correspondence Analysis. Academic Press, New York (1984)

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Ping Luo (1)
  • Li-Ping Tian (2)
  • Bolin Chen (3)
  • Qianghua Xiao (4)
  • Fang-Xiang Wu (1, 5, 6), Email author

  1. Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
  2. School of Information, Beijing Wuzi University, Beijing, China
  3. School of Computer Science and Technology, Northwestern Polytechnical University, Xi’an, China
  4. School of Mathematics and Physics, University of South China, Hengyang, China
  5. School of Mathematical Sciences, Nankai University, Tianjin, China
  6. Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Canada
