Deep Learning for Learning Graph Representations

Chapter in: Deep Learning: Concepts and Architectures

Part of the book series: Studies in Computational Intelligence (SCI, volume 866)

Abstract

Mining graph data has become a popular research topic in computer science and has been widely studied in both academia and industry, given the increasing amount of network data in recent years. However, this huge volume of network data poses great challenges for efficient analysis. This motivates graph representation learning, which maps a graph into a low-dimensional vector space while preserving the original graph structure and supporting graph inference. Because efficient graph representation has profound theoretical significance and important practical value, this chapter introduces basic ideas in graph representation/network embedding as well as some representative models.
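As a minimal, hypothetical illustration of this embedding idea (a sketch, not any specific model from the chapter), one can factorize a toy graph's adjacency matrix and keep only the leading singular directions as low-dimensional node vectors; nodes with similar neighborhoods then land close together:

```python
import numpy as np

# Toy undirected graph: a triangle on nodes 0, 1, 2, plus node 3
# attached to node 2, given as an adjacency matrix.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Keep the top-d singular directions; each row of U[:, :d] * sqrt(S[:d])
# is a d-dimensional embedding of the corresponding node.
d = 2
U, S, _ = np.linalg.svd(A)
embeddings = U[:, :d] * np.sqrt(S[:d])

assert embeddings.shape == (4, 2)
# Nodes 0 and 1 have identical neighborhoods, so their embeddings end up
# closer to each other than to the pendant node 3's embedding.
```

This truncated-factorization view is the simplest instance of the structure-preserving objective the abstract describes; matrix-factorization embedding methods such as GraRep [11] build on the same intuition with higher-order proximity information.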

Notes

  1.

    Negative links exist in signed networks, but only non-negative links are considered here.

  2.

    Here the authors use the sigmoid function \(\sigma (x)=\frac{1}{1+\exp (-x)}\) as the non-linear activation function.

  3.

    To simplify notation, the network representations \(Y^{(K)}=\{\mathbf {y}^{(K)}_i\}_{i=1}^n\) are denoted as \(Y=\{\mathbf {y}_i\}_{i=1}^n\) by the authors.

  4.

    When the covariance matrices are not diagonal, Wang et al. propose a fast iterative algorithm (i.e., BADMM) to solve the Wasserstein distance [70].

  5.

    Note that the first term \(p(\{\mathbf {f}(v): v\in V^{*} \} \mid \{\mathbf {h}(v): v\in V^{*} \})\) is maximized when \(\mathbf {f}(v) = \mathbf {g}(\mathbf {h}(v))\), and the maximum value of this probability density is a constant independent of \(\mathbf {h}(v)\). Hence we can focus on maximizing the second term first.
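Note 4 concerns computing Wasserstein distances between Gaussian node representations, as used in DVNE [73]. As a small sketch (the function name is ours), the squared 2-Wasserstein distance between two Gaussians with *diagonal* covariances has a simple closed form, \(W_2^2 = \Vert \mu_1-\mu_2\Vert ^2 + \Vert \sqrt{\Sigma _1}-\sqrt{\Sigma _2}\Vert _F^2\), which is precisely why the non-diagonal case in Note 4 requires an iterative solver such as BADMM [70]:

```python
import numpy as np

def w2_diagonal_gaussians(mu1, var1, mu2, var2):
    """Squared 2-Wasserstein distance between N(mu1, diag(var1)) and
    N(mu2, diag(var2)): ||mu1 - mu2||^2 + ||sqrt(var1) - sqrt(var2)||^2."""
    mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.sum((np.sqrt(var1) - np.sqrt(var2)) ** 2))

# Identical Gaussians are at distance zero; shifting one mean by 3 in a
# single dimension contributes exactly 3^2 = 9.
assert w2_diagonal_gaussians([0, 0], [1, 1], [0, 0], [1, 1]) == 0.0
assert w2_diagonal_gaussians([3, 0], [1, 1], [0, 0], [1, 1]) == 9.0
```

Unlike the KL divergence, this quantity is a true metric (symmetric and satisfying the triangle inequality [13]), which is the property DVNE exploits when modeling transitivity and uncertainty.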

References

  1. Agarwal, S., Branson, K., Belongie, S.: Higher order learning with graphs. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 17–24. ACM (2006)

  2. Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media (2008)

  3. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization (2016). arXiv preprint arXiv:1607.06450

  4. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)

  5. Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7, 2399–2434 (2006)

  6. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)

  7. Bonacich, P.: Some unique properties of eigenvector centrality. Soc. Netw. 29(4), 555–564 (2007)

  8. Bonneel, N., Rabin, J., Peyré, G., Pfister, H.: Sliced and Radon Wasserstein barycenters of measures. J. Math. Imaging Vis. 51(1), 22–45 (2015)

  9. Bonneel, N., Van De Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using Lagrangian mass transport. ACM Trans. Graph. (TOG) 30, 158 (2011)

  10. Bryant, V.: Metric Spaces: Iteration and Application. Cambridge University Press (1985)

  11. Cao, S., Lu, W., Xu, Q.: GraRep: learning graph representations with global structural information. In: CIKM ’15, pp. 891–900. ACM, New York (2015)

  12. Chen, C., Tong, H.: Fast eigen-functions tracking on dynamic graphs. In: Proceedings of the 2015 SIAM International Conference on Data Mining, pp. 559–567. SIAM (2015)

  13. Clement, P., Desch, W.: An elementary proof of the triangle inequality for the Wasserstein metric. Proc. Am. Math. Soc. 136(1), 333–339 (2008)

  14. Courty, N., Flamary, R., Ducoffe, M.: Learning Wasserstein embeddings (2017). arXiv preprint arXiv:1710.07457

  15. Courty, N., Flamary, R., Tuia, D., Rakotomamonjy, A.: Optimal transport for domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 39(9), 1853–1865 (2017)

  16. Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. In: International Conference on Machine Learning, pp. 685–693 (2014)

  17. Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)

  18. Dash, N.S.: Context and contextual word meaning. SKASE J. Theor. Linguist. 5(2), 21–31 (2008)

  19. De Goes, F., Breeden, K., Ostromoukhov, V., Desbrun, M.: Blue noise through optimal transport. ACM Trans. Graph. (TOG) 31(6), 171 (2012)

  20. Delalleau, O., Bengio, Y., Roux, N.L.: Efficient non-parametric function induction in semi-supervised learning. In: AISTATS ’05, pp. 96–103 (2005)

  21. Doersch, C.: Tutorial on variational autoencoders (2016). arXiv preprint arXiv:1606.05908

  22. Dreyfus, S.: The numerical solution of variational problems. J. Math. Anal. Appl. 5(1), 30–45 (1962)

  23. Eom, Y.-H., Jo, H.-H.: Tail-scope: using friends to estimate heavy tails of degree distributions in large-scale complex networks. Sci. Rep. 5 (2015)

  24. Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., Bengio, S.: Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11, 625–660 (2010)

  25. Givens, C.R., Shortt, R.M., et al.: A class of Wasserstein metrics for probability distributions. Mich. Math. J. 31(2), 231–240 (1984)

  26. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315–323 (2011)

  27. Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: KDD ’16, pp. 855–864. ACM, New York (2016)

  28. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition (2015). arXiv preprint arXiv:1512.03385

  29. Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A.-R., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., et al.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29(6), 82–97 (2012)

  30. Hinton, G.E., Osindero, S., Teh, Y.-W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)

  31. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  32. Holland, P.W., Leinhardt, S.: Holland and Leinhardt reply: some evidence on the transitivity of positive interpersonal sentiment (1972)

  33. Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991)

  34. Jamali, M., Ester, M.: A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the Fourth ACM Conference on Recommender Systems, pp. 135–142. ACM (2010)

  35. Jin, E.M., Girvan, M., Newman, M.E.: Structure of growing social networks. Phys. Rev. E 64(4), 046132 (2001)

  36. Kingma, D., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv preprint arXiv:1412.6980

  37. Kolouri, S., Park, S.R., Thorpe, M., Slepcev, D., Rohde, G.K.: Optimal mass transport: signal processing and machine-learning applications. IEEE Signal Process. Mag. 34(4), 43–59 (2017)

  38. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

  39. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)

  40. LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M., Huang, F.: A tutorial on energy-based learning. Predict. Struct. Data 1 (2006)

  41. Leicht, E.A., Holme, P., Newman, M.E.: Vertex similarity in networks. Phys. Rev. E 73(2), 026120 (2006)

  42. Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks. J. Am. Soc. Inf. Sci. Technol. 58(7), 1019–1031 (2007)

  43. Luo, D., Nie, F., Huang, H., Ding, C.H.: Cauchy graph embedding. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 553–560 (2011)

  44. Ma, J., Cui, P., Zhu, W.: DepthLGP: learning embeddings of out-of-sample nodes in dynamic networks. In: AAAI, pp. 370–377 (2018)

  45. Mikolov, T., Karafiát, M., Burget, L., Černockỳ, J., Khudanpur, S.: Recurrent neural network based language model. In: Eleventh Annual Conference of the International Speech Communication Association, pp. 1045–1048 (2010)

  46. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  47. Nathan, E., Bader, D.A.: A dynamic algorithm for updating Katz centrality in graphs. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, pp. 149–154. ACM (2017)

  48. Ou, M., Cui, P., Pei, J., Zhang, Z., Zhu, W.: Asymmetric transitivity preserving graph embedding. In: Proceedings of ACM SIGKDD, pp. 1105–1114 (2016)

  49. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab (1999)

  50. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: SIGKDD, pp. 701–710. ACM (2014)

  51. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press (2005)

  52. Rossi, R.A., Ahmed, N.K.: Role discovery in networks. IEEE Trans. Knowl. Data Eng. 27(4), 1112–1131 (2015)

  53. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. In: Neurocomputing: Foundations of Research, pp. 696–699. MIT Press, Cambridge (1988)

  54. Salakhutdinov, R., Hinton, G.: Semantic hashing. Int. J. Approx. Reason. 50(7), 969–978 (2009)

  55. Shaw, B., Jebara, T.: Structure preserving embedding. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 937–944. ACM (2009)

  56. Siegelmann, H.T., Sontag, E.D.: On the computational power of neural nets. J. Comput. Syst. Sci. 50(1), 132–150 (1995)

  57. Smola, A.J., Kondor, R.: Kernels and Regularization on Graphs, pp. 144–158. Springer, Berlin (2003)

  58. Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642 (2013)

  59. Stoyanov, V., Ropson, A., Eisner, J.: Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure. In: AISTATS ’11, Fort Lauderdale, April 2011

  60. Sun, L., Ji, S., Ye, J.: Hypergraph spectral learning for multi-label classification. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 668–676. ACM (2008)

  61. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: LINE: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077. International World Wide Web Conferences Steering Committee (2015)

  62. Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)

  63. Tian, F., Gao, B., Cui, Q., Chen, E., Liu, T.-Y.: Learning deep representations for graph clustering. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, pp. 1293–1299 (2014)

  64. Tolstikhin, I., Bousquet, O., Gelly, S., Schoelkopf, B.: Wasserstein auto-encoders (2017). arXiv preprint arXiv:1711.01558

  65. Tu, K., Cui, P., Wang, X., Wang, F., Zhu, W.: Structural deep embedding for hyper-networks. In: Thirty-Second AAAI Conference on Artificial Intelligence, pp. 426–433 (2018)

  66. Tu, K., Cui, P., Wang, X., Yu, P.S., Zhu, W.: Deep recursive network embedding with regular equivalence. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2357–2366. ACM (2018)

  67. Vilnis, L., McCallum, A.: Word representations via Gaussian embedding (2014). arXiv preprint arXiv:1412.6623

  68. Vishwanathan, S.V.N., Schraudolph, N.N., Kondor, R., Borgwardt, K.M.: Graph kernels. J. Mach. Learn. Res. 11, 1201–1242 (2010)

  69. Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234. ACM (2016)

  70. Wang, H., Banerjee, A.: Bregman alternating direction method of multipliers. In: Advances in Neural Information Processing Systems, pp. 2816–2824 (2014)

  71. Werbos, P.J.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990)

  72. Zang, C., Cui, P., Faloutsos, C., Zhu, W.: Long short memory process: modeling growth dynamics of microscopic social connectivity. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 565–574. ACM (2017)

  73. Zhu, D., Cui, P., Wang, D., Zhu, W.: Deep variational network embedding in Wasserstein space. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2827–2836. ACM (2018)

  74. Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation. Technical report (2002)

  75. Zhu, X., Ghahramani, Z., Lafferty, J.: Semi-supervised learning using Gaussian fields and harmonic functions. In: ICML ’03, pp. 912–919. AAAI Press (2003)

  76. Zhuang, J., Tsang, I.W., Hoi, S.: Two-layer multiple kernel learning. In: International Conference on Artificial Intelligence and Statistics, pp. 909–917 (2011)

Acknowledgements

We thank Ke Tu (DRNE and DHNE), Daixin Wang (SDNE), Dingyuan Zhu (DVNE) and Jianxin Ma (DepthLGP) for providing us with valuable materials. Xin Wang is the corresponding author. This work is supported by China Postdoctoral Science Foundation No. BX201700136, National Natural Science Foundation of China Major Project No. U1611461 and National Program on Key Basic Research Project No. 2015CB352300.

Corresponding author

Correspondence to Wenwu Zhu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhu, W., Wang, X., Cui, P. (2020). Deep Learning for Learning Graph Representations. In: Pedrycz, W., Chen, SM. (eds) Deep Learning: Concepts and Architectures. Studies in Computational Intelligence, vol 866. Springer, Cham. https://doi.org/10.1007/978-3-030-31756-0_6
