Clustering Attributed Multi-graphs with Information Ranking

Papadopoulos, Andreas; Rafailidis, Dimitrios; Pallis, George; Dikaiakos, Marios D.

doi:10.1007/978-3-319-22849-5_29

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9261)

Abstract

Attributed multi-graphs are data structures that model real-world networks of objects which have rich properties/attributes and are connected by multiple types of edges. Clustering attributed multi-graphs has several real-world applications, such as recommendation systems and targeted advertising. In this paper, we propose an efficient method for Clustering Attributed Multi-graphs with Information Ranking, namely CAMIR. We introduce an iterative algorithm that ranks the different vertex attributes and edge-types according to how well they can separate vertices into clusters. The key idea is to consider the ‘agreement’ among the attribute- and edge-types, assuming that two vertex properties ‘agree’ if they produce the same clustering result when used individually. Furthermore, according to the calculated ranks, we construct a unified similarity measure by down-weighting noisy vertex attributes or edge-types that may reduce the clustering accuracy. Finally, to generate the final clusters, we follow a spectral clustering approach, which is suitable for graph partitioning and for detecting arbitrarily shaped clusters. In our experiments with synthetic and real-world datasets, we show the superiority of CAMIR over several state-of-the-art clustering methods.
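
The agreement-based ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the CAMIR algorithm itself: the abstract does not say how agreement is measured or how the iteration proceeds, so the sketch assumes the Rand index as the agreement measure, performs a single non-iterative ranking pass, and combines per-property similarity matrices with a plain weighted sum. The function names and the toy property names (coauthor, topic, noise) are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of vertex pairs on which two clusterings agree, i.e. both
    place the pair in the same cluster or both place it in different ones.
    Assumption: the Rand index stands in for the paper's agreement measure."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def rank_properties(clusterings):
    """Score each property (vertex attribute or edge-type) by its mean
    agreement with every other property, then normalize to sum to 1.
    Assumes at least two properties and at least one non-zero score."""
    names = list(clusterings)
    scores = {}
    for p in names:
        others = [q for q in names if q != p]
        scores[p] = sum(
            rand_index(clusterings[p], clusterings[q]) for q in others
        ) / len(others)
    total = sum(scores.values())
    return {p: s / total for p, s in scores.items()}


def unified_similarity(similarities, weights):
    """Weighted sum of per-property n x n similarity matrices (lists of
    lists), so low-ranked (noisy) properties contribute less."""
    n = len(next(iter(similarities.values())))
    return [
        [sum(weights[p] * similarities[p][i][j] for p in similarities)
         for j in range(n)]
        for i in range(n)
    ]


# Toy input: cluster labels obtained by clustering four vertices with each
# property used individually. "noise" disagrees with the other two.
clusterings = {
    "coauthor": [0, 0, 1, 1],
    "topic": [0, 0, 1, 1],
    "noise": [0, 1, 0, 1],
}
weights = rank_properties(clusterings)
```

In this toy input the two properties that agree with each other receive a higher weight than the one that disagrees with both. The resulting unified matrix could then be passed to any spectral clustering routine that accepts a precomputed affinity matrix, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`.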

Notes

1. Alternatively, several parallel spectral clustering methods could be used in the proposed approach, such as the works of [19, 20], to reduce the computational time of spectral clustering.
2. Also, other types of kernel functions could be used, such as linear and polynomial, thoroughly examined in [21] for machine learning methods.
3. Following [7] we use a common \(\lambda\) for all properties. In practice though, \(\lambda = 0.001\) is an appropriate value to control the impact of the other properties, as we observed in our experiments.
4. The full DBLP dataset is available online at http://kdl.cs.umass.edu/data/dblp/dblp-info.html.
5. Available online at http://code.google.com.

References

1. Cheng, H., Zhou, Y., Yu, J.X.: Clustering large attributed graphs: a balance between structural and attribute similarities. ACM Trans. Knowl. Discov. Data 5(2), 12:1–12:33 (2011)
2. Papadopoulos, A., Pallis, G., Dikaiakos, M.D.: Identifying clusters with attribute homogeneity and similar connectivity in information networks. In: Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), WI-IAT 2013, vol. 01, pp. 343–350. IEEE Computer Society, Washington, DC (2013)
3. Akoglu, L., Tong, H., Meeder, B., Faloutsos, C.: Pics: parameter-free identification of cohesive subgroups in large attributed graphs. In: Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012, pp. 439–450. SIAM/Omnipress (2012)
4. Perozzi, B., Akoglu, L., Iglesias Sánchez, P., Müller, E.: Focused clustering and outlier detection in large attributed graphs. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014, pp. 1346–1355. ACM, New York (2014)
5. Xu, Z., Ke, Y., Wang, Y., Cheng, H., Cheng, J.: GBAGC: a general bayesian framework for attributed graph clustering. ACM Trans. Knowl. Discov. Data 9(1), 5:1–5:43 (2014)
6. Zhou, Y., Cheng, H., Yu, J.X.: Graph clustering based on structural/attribute similarities. Proc. VLDB Endow. 2(1), 718–729 (2009)
7. Kumar, A., Rai, P., Daume, H.: Co-regularized multi-view spectral clustering. In: Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K. (eds.) Advances in Neural Information Processing Systems 24, pp. 1413–1421. Curran Associates, Inc., NY (2011)
8. Karypis, G., Kumar, V.: Multilevel algorithms for multi-constraint graph partitioning. In: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing, SC 1998, pp. 1–13. IEEE Computer Society, Washington, DC (1998)
9. Papalexakis, E., Akoglu, L., Ience, D.: Do more views of a graph help? community detection and clustering in multi-graphs. In: 2013 16th International Conference on Information Fusion (FUSION), pp. 899–905 (2013)
10. Xu, X., Yuruk, N., Feng, Z., Schweiger, T.A.J.: SCAN: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2007, pp. 824–833. ACM, New York (2007)
11. Yang, J., McAuley, J.J., Leskovec, J.: Community detection in networks with node attributes. [24], pp. 1151–1156
12. Xu, Z., Ke, Y., Wang, Y., Cheng, H., Cheng, J.: A model-based approach to attributed graph clustering. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, SIGMOD 2012, pp. 505–516. ACM, New York (2012)
13. Günnemann, S., Färber, I., Raubach, S., Seidl, T.: Spectral subspace clustering for graphs with feature vectors. [24], pp. 231–240
14. Günnemann, S., Boden, B., Färber, I., Seidl, T.: Efficient mining of combined subspace and subgraph clusters in graphs with feature vectors. In: Pei, J., Tseng, V.S., Cao, L., Motoda, H., Xu, G. (eds.) PAKDD 2013, Part I. LNCS, vol. 7818, pp. 261–275. Springer, Heidelberg (2013)
15. Zhou, Y., Cheng, H., Yu, J.X.: Clustering large attributed graphs: an efficient incremental approach. In: Webb, G.I., Liu, B., Zhang, C., Gunopulos, D., Wu, X. (eds.) ICDM 2010, pp. 689–698. IEEE Computer Society, Washington, DC (2010)
16. Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)
17. Chan, P.K., Schlag, M.D.F., Zien, J.Y.: Spectral K-way ratio-cut partitioning and clustering. IEEE Trans. CAD Integr. Circuits Syst. 13(9), 1088–1096 (1994)
18. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
19. Chen, W.Y., Song, Y., Bai, H., Lin, C.J., Chang, E.: Parallel spectral clustering in distributed systems. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 568–586 (2011)
20. Kang, U., Meeder, B., Papalexakis, E.E., Faloutsos, C.: Heigen: spectral analysis for billion-scale graphs. IEEE Trans. Knowl. Data Eng. 26(2), 350–362 (2014)
21. Hofmann, T., Schölkopf, B., Smola, A.J.: Kernel methods in machine learning. Ann. Stat. 36(3), 1171–1220 (2008)
22. Ning, H., Xu, W., Chi, Y., Gong, Y., Huang, T.S.: Incremental spectral clustering by efficiently updating the eigen-system. Pattern Recogn. 43(1), 113–127 (2010)
23. Mall, R., Langone, R., Suykens, J.A.K.: Kernel spectral clustering for big data networks. Entropy 15(5), 1567–1586 (2013)
24. Xiong, H., Karypis, G., Thuraisingham, B.M., Cook, D.J., Wu, X. (eds.): 2013 IEEE 13th International Conference on Data Mining. IEEE Computer Society, Washington, DC (2013)

Acknowledgments

This work was partially supported by the EU Commission under the PaaSport 605193 FP7 project (FP7-SME-2013).

Author information

Authors and Affiliations

Department of Computer Science, University of Cyprus, Nicosia, Cyprus
Andreas Papadopoulos, Dimitrios Rafailidis, George Pallis & Marios D. Dikaiakos

Corresponding author

Correspondence to Andreas Papadopoulos.

Editor information

Editors and Affiliations

Hewlett-Packard Enterprise, Sunnyvale, California, USA
Qiming Chen
Paul Sabatier University, Toulouse, France
Abdelkader Hameurlain
Blaise Pascal University, Aubiere, France
Farouk Toumani
University of Linz, Linz, Austria
Roland Wagner
Universidad Politécnica de Valencia, Valencia, Spain
Hendrik Decker

Cite this paper

Papadopoulos, A., Rafailidis, D., Pallis, G., Dikaiakos, M.D. (2015). Clustering Attributed Multi-graphs with Information Ranking. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015 2015. Lecture Notes in Computer Science, vol 9261. Springer, Cham. https://doi.org/10.1007/978-3-319-22849-5_29

DOI: https://doi.org/10.1007/978-3-319-22849-5_29
Published: 11 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22848-8
Online ISBN: 978-3-319-22849-5
eBook Packages: Computer Science, Computer Science (R0)
