Skip to main content

Graph Clustering Based on Attribute-Aware Graph Embedding

  • Chapter
  • First Online:
From Security to Community Detection in Social Networking Platforms (ASONAM 2017)

Part of the book series: Lecture Notes in Social Networks ((LNSN))

Abstract

Graph clustering is a fundamental problem in graph mining and network analysis. To group vertices of a graph into a series of densely knitted clusters with each cluster being well-separated from all the others, classic methods primarily consider the mere graph structure information in modeling and quantifying the proximity or distance of vertices for graph clustering. However, with the proliferation of rich, heterogeneous attribute information widely available in real-world graphs, such as user profiles in social networks, and GO (Gene Ontology) terms in protein interaction networks, it becomes essential to combine both structure and attribute information of graphs towards yielding better-quality clusters. In this chapter, we propose a new graph embedding approach for attributed graph clustering. We embed each vertex of a graph into a continuous vector space within which the local structure and attribute information surrounding the vertex can be jointly encoded in a unified, latent representation. Specifically, we quantify the vertex-wise attribute proximity into edge weights and leverage a group of truncated, attribute-aware random walks to learn the latent representations of vertices. This way, the challenging attributed graph clustering problem can be cast into the traditional problem of multidimensional data clustering, which has admitted efficient and cost-effective solutions. We apply our attribute-aware graph embedding algorithm in a series of real-world and synthetic attributed graphs and networks. The experimental studies demonstrate that our proposed method significantly outperforms the state-of-the-art attributed graph clustering techniques in terms of both clustering effectiveness and efficiency.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 119.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
USD 159.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    http://www-personal.umich.edu/~mejn/netdata.

  2. 2.

    http://dblp.uni-trier.de/xml/.

  3. 3.

    http://www.nber.org/patents.

References

  1. Akbas, E., Zhao, P.: Attributed graph clustering: an attribute-aware graph embedding approach. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017 (ASONAM’17), pp. 305–308. ACM, New York (2017), http://doi.acm.org/10.1145/3110025.3110092

  2. Akoglu, L., Tong, H., Meeder, B., Faloutsos, C.: PICS: parameter-free identification of cohesive subgroups in large attributed graphs. In: Proceedings of the Twelfth SIAM International Conference on Data Mining, Anaheim (SDM’12), pp. 439–450. Society for Industrial and Applied Mathematics, Philadelphia (2012)

    Google Scholar 

  3. Andersen, R., Chung, F., Lang, K.: Local graph partitioning using pagerank vectors. In: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), pp. 475–486. IEEE, Piscataway (2006)

    Google Scholar 

  4. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)

    Article  Google Scholar 

  5. Boden, B., Haag, R., Seidl, T.: Detecting and exploring clusters in attributed graphs: a plugin for the gephi platform. In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (CIKM’13), pp. 2505–2508. ACM, New York (2013)

    Google Scholar 

  6. Bothorel, C., Cruz, J.D., Magnani, M., Micenkova, B.: Clustering attributed graphs: models, measures and methods. Netw. Sci. 3, 408–444 (2015)

    Article  Google Scholar 

  7. Cannataro, M., Guzzi, P.H., Veltri, P.: Protein-to-protein interactions: technologies, databases, and algorithms. ACM Comput. Surv. 43(1), 1:1–1:36 (2010)

    Google Scholar 

  8. Cao, S., Lu, W., Xu, Q.: Grarep: Learning graph representations with global structural information. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM’15), pp. 891–900. ACM, New York (2015)

    Google Scholar 

  9. Dourisboure, Y., Geraci, F., Pellegrini, M.: Extraction and classification of dense communities in the web. In: Proceedings of the 16th International Conference on World Wide Web (WWW’07), pp. 461–470. ACM, New York (2007)

    Google Scholar 

  10. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)

    Article  MathSciNet  Google Scholar 

  11. Gong, N.Z., Xu, W., Huang, L., Mittal, P., Stefanov, E., Sekar, V., Song, D.: Evolution of social-attribute networks: measurements, modeling, and implications using Google+. In: Proceedings of the 2012 ACM Conference on Internet Measurement Conference (IMC’12), pp. 131–144. ACM, New York (2012)

    Google Scholar 

  12. He, X., Ding, C.H.Q., Zha, H., Simon, H.D.: Automatic topic identification using webpage clustering. In: Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM’01), pp. 195–202. IEEE, Piscataway (2001)

    Google Scholar 

  13. Henderson, K., Eliassi-Rad, T., Papadimitriou, S., Faloutsos, C.: HCDF: a hybrid community discovery framework. In: Proceedings of the SIAM International Conference on Data Mining (SDM’10), pp. 754–765. Society for Industrial and Applied Mathematics, Philadelphia (2010)

    Google Scholar 

  14. Hu, A.L., Chan, K.C.C.: Utilizing both topological and attribute information for protein complex identification in PPI networks. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(3), 780–792 (2013)

    Article  Google Scholar 

  15. Kim, M., Leskovec, J.: Multiplicative attribute graph model of real-world networks. Internet Math. 8(1–2), 113–160 (2012)

    Article  MathSciNet  Google Scholar 

  16. Lattanzi, S., Sivakumar, D.: Affiliation networks. In: Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing (STOC’09), pp. 427–434. ACM, New York (2009)

    Google Scholar 

  17. Li, R., Wang, C., Chang, K.C.C.: User profiling in an ego network: co-profiling attributes and relationships. In: Proceedings of the 23rd International Conference on World Wide Web (WWW’14), pp. 819–830. ACM, New York (2014)

    Google Scholar 

  18. Liu, L., Xu, L., Wangy, Z., Chen, E.: Community detection based on structure and content: a content propagation perspective. In: 2015 IEEE International Conference on Data Mining, pp. 271–280. IEEE, Piscataway (2015)

    Google Scholar 

  19. Macropol, K., Singh, A.: Scalable discovery of best clusters on large graphs. Proc. VLDB Endow. 3(1–2), 693–702 (2010)

    Article  Google Scholar 

  20. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: 27th Annual Conference on Neural Information Processing Systems (NIPS’13), pp. 3111–3119 (2013)

    Google Scholar 

  21. Mnih, A., Hinton, G.E.: A scalable hierarchical distributed language model. In: Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS’08), pp. 1081–1088 (2008)

    Google Scholar 

  22. Perozzi, B., Akoglu, L., Iglesias Sánchez, P., Müller, E.: Focused clustering and outlier detection in large attributed graphs. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’14), pp. 1346–1355. ACM, New York (2014)

    Google Scholar 

  23. Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’14), pp. 701–710. ACM, New York (2014)

    Google Scholar 

  24. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)

    Article  Google Scholar 

  25. Ruan, Y., Fuhry, D., Parthasarathy, S.: Efficient community detection in large networks using content and links. In: Proceedings of the 22nd International Conference on World Wide Web (WWW’13), pp. 1089–1098. ACM, New York (2013)

    Google Scholar 

  26. Schaeffer, S.E.: Survey: graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)

    Article  Google Scholar 

  27. Steinhaeuser, K., Chawla, N.V.: Identifying and evaluating community structure in complex networks. Pattern Recogn. Lett. 31(5), 413–421 (2010)

    Article  Google Scholar 

  28. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web (WWW’15), pp. 1067–1077. International World Wide Web Conferences Steering Committee, Geneva (2015)

    Google Scholar 

  29. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)

    Article  Google Scholar 

  30. Xu, Z., Ke, Y., Wang, Y., Cheng, H., Cheng, J.: A model-based approach to attributed graph clustering. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD’12), pp. 505–516. ACM, New York (2012)

    Google Scholar 

  31. Xu, Z., Ke, Y., Wang, Y., Cheng, H., Cheng, J.: GBAGC: a general Bayesian framework for attributed graph clustering. ACM Trans. Knowl. Discov. Data 9(1), 5:1–5:43 (2014)

    Google Scholar 

  32. Yang, T., Jin, R., Chi, Y., Zhu, S.: Combining link and content for community detection: a discriminative approach. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09), pp. 927–936. ACM, New York (2009)

    Google Scholar 

  33. Zanghi, H., Volant, S., Ambroise, C.: Clustering based on random graph model embedding vertex features. Pattern Recogn. Lett. 31(9), 830–836 (2010)

    Article  Google Scholar 

  34. Zhai, C., Velivelli, A., Yu, B.: A cross-collection mixture model for comparative text mining. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’04), pp. 743–748. ACM, New York (2004)

    Google Scholar 

  35. Zhao, X., Chang, A., Sarma, A.D., Zheng, H., Zhao, B.Y.: On the embeddability of random walk distances. Proc. VLDB Endow. 6(14), 1690–1701 (2013)

    Article  Google Scholar 

  36. Zhou, Y., Cheng, H., Yu, J.X.: Graph clustering based on structural/attribute similarities. Proc. VLDB Endow. 2(1), 718–729 (2009)

    Article  Google Scholar 

  37. Zhou, Y., Cheng, H., Yu, J.X.: Clustering large attributed graphs: an efficient incremental approach. In: Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM’10), pp. 689–698. IEEE, Piscataway (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Peixiang Zhao .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Akbas, E., Zhao, P. (2019). Graph Clustering Based on Attribute-Aware Graph Embedding. In: Karampelas, P., Kawash, J., Özyer, T. (eds) From Security to Community Detection in Social Networking Platforms. ASONAM 2017. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-11286-8_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-11286-8_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11285-1

  • Online ISBN: 978-3-030-11286-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics