A Hybrid VAE Based Network Embedding Method for Biomedical Relation Mining

Abstract

Mining associations between biomedical entities and extracting implicit knowledge from biomedical entity relation networks are important for precision medicine. In this paper, we propose a novel method for implicit relation mining from biomedical multi-entity networks. In the embedding part, we combine two kinds of models, (1) a graph representation learning model such as GraphGAN and (2) a network embedding model such as VAE-based SDNE, to construct a hybrid model, GVS. In the prediction part, positive samples selected from the original network and negative samples generated by ranking meta-paths are used to train a k-nearest neighbor (kNN) model. To evaluate the performance of GVS, we compare the proposed method with three state-of-the-art methods (Katz, Catapult and IMC) on benchmark datasets. Moreover, we evaluate GVS on a real biomedical entity relation network, where it shows advantages over other network embedding methods and successfully mines implicit relationships that are validated by PubMed.
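The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-D embeddings stand in for vectors a model such as GVS would learn, the pair feature (concatenating the two node embeddings), the Euclidean distance, and k = 3 are all assumptions, and the negative pairs are hand-picked here rather than generated by ranking meta-paths. The helper names `edge_features` and `knn_predict` are hypothetical.

```python
import numpy as np

def edge_features(emb, pairs):
    """Represent each (u, v) pair by concatenating the two node embeddings (an assumption)."""
    pairs = np.asarray(pairs)
    return np.hstack([emb[pairs[:, 0]], emb[pairs[:, 1]]])

def knn_predict(train_X, train_y, query_X, k=3):
    """Label each query pair by majority vote among its k nearest training pairs."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]  # shape (n_query, k), entries are 0 or 1
    return (votes.mean(axis=1) > 0.5).astype(int)

# Toy embeddings for six entities (hypothetical values, not learned by any model).
emb = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],  # entities 0-2 lie close together
    [1.0, 1.0], [1.1, 1.0], [1.0, 1.1],  # entities 3-5 lie close together
])

# Positive samples: pairs taken as known relations in the toy network.
positive_pairs = [(0, 1), (0, 2), (3, 4), (3, 5)]
# Negative samples: hand-picked unrelated pairs (the paper ranks meta-paths instead).
negative_pairs = [(0, 3), (1, 4), (2, 5), (0, 4)]

train_X = edge_features(emb, positive_pairs + negative_pairs)
train_y = np.array([1] * len(positive_pairs) + [0] * len(negative_pairs))

# Candidate pairs whose relation is not in the training set.
candidates = [(1, 2), (4, 5), (1, 5)]
pred = knn_predict(train_X, train_y, edge_features(emb, candidates), k=3)
print(dict(zip(candidates, pred.tolist())))
```

In this toy setting, a candidate pair is labeled as an implicit relation when most of its nearest training pairs are positive samples; with real embeddings, the choice of pair feature and of k would need to be tuned on held-out data.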

Acknowledgements

This work is supported by the Development Project of Jilin Province of China (Nos. 20200801033GH, YDZJ202101ZYTS128), the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (No. 20180622002JC), and the Fundamental Research Funds for the Central University, JLU.

Author information

Corresponding author

Correspondence to Lan Huang.

About this article

Cite this article

Bai, T., Li, Y., Wang, Y. et al. A Hybrid VAE Based Network Embedding Method for Biomedical Relation Mining. Neural Process Lett (2021). https://doi.org/10.1007/s11063-021-10454-5

Keywords

  • Network embedding
  • Variational auto-encoder
  • Biomedical relation mining
  • k-nearest neighbor