Detecting Smooth Cluster Changes in Evolving Graph Structures

  • Sohei Okui
  • Kaho Osamura
  • Akihiro Inokuchi
Chapter
Part of the Studies in Big Data book series (SBD, volume 41)

Abstract

Graph mining is a set of techniques for finding useful patterns in various types of structured data. Many effective algorithms for mining static graphs have been proposed. However, graphs of human relationships and evolving genes change over time, and analyzing such evolving graphs requires different algorithms. In this chapter, we explain O2I, a clustering method for evolving graphs that can detect changes in clusters over time. O2I partitions the graph sequence into smooth clusters, even when the numbers of clusters and vertices vary. It first constructs a single graph from the graph sequence, then applies spectral clustering with the RatioCut objective to partition this graph into k clusters. O2I is compared in detail with the preserving clustering membership (PCM) algorithm, a conventional online graph-sequence clustering algorithm that requires the numbers of clusters and vertices to remain constant. We further show that, in contrast to PCM, the performance of O2I does not depend on the clustering of the initial graph in the graph sequence. Experiments on synthetic evolving graphs show that O2I is practical to compute and addresses the main disadvantages of PCM. Further tests on real-world data show that O2I can obtain reasonable clusters. The method is hence a flexible clustering solution and will be useful in a wide range of graph-mining applications in which the connections, number of clusters, and number of vertices of the graphs evolve over time.
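The two steps named in the abstract (build one graph from the graph sequence, then partition it spectrally under RatioCut) can be illustrated with a small sketch. This is not the authors' O2I construction, which the abstract does not spell out. It assumes a fixed vertex set across snapshots, assumes that the combined graph links each vertex to its own copy in the next snapshot with a hypothetical `link_weight`, and handles only k = 2 by thresholding the Fiedler vector of the unnormalized Laplacian (the standard relaxation of RatioCut). All function names are illustrative.

```python
import numpy as np

def build_sequence_graph(snapshots, link_weight=1.0):
    """Stack snapshot adjacency matrices block-diagonally and connect each
    vertex to its copy in the next snapshot (assumed construction)."""
    n = snapshots[0].shape[0]
    steps = len(snapshots)
    W = np.zeros((steps * n, steps * n))
    for t, A in enumerate(snapshots):
        W[t * n:(t + 1) * n, t * n:(t + 1) * n] = A
    for t in range(steps - 1):
        for v in range(n):
            i, j = t * n + v, (t + 1) * n + v
            W[i, j] = W[j, i] = link_weight  # temporal edge (assumption)
    return W

def ratiocut_bipartition(W):
    """Relaxed RatioCut for k = 2: sign of the eigenvector belonging to the
    second-smallest eigenvalue of the unnormalized Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    fiedler = vecs[:, 1]
    return (fiedler > 0).astype(int)

def clique_pair(bridge):
    """Toy snapshot: two triangles {0,1,2} and {3,4,5}, optionally bridged."""
    A = np.zeros((6, 6))
    for group in ((0, 1, 2), (3, 4, 5)):
        for u in group:
            for v in group:
                if u != v:
                    A[u, v] = 1.0
    if bridge:
        A[2, 3] = A[3, 2] = 1.0
    return A

# Two-step toy sequence: the bridge edge disappears in the second snapshot.
snapshots = [clique_pair(bridge=True), clique_pair(bridge=False)]
W = build_sequence_graph(snapshots)
labels = ratiocut_bipartition(W).reshape(len(snapshots), 6)
print(labels)  # one row of cluster labels per snapshot
```

Because every snapshot is clustered through one combined graph, no single snapshot (such as the initial one) fixes the result for the others. For general k, the usual approach is to run k-means on the rows of the first k eigenvectors, and a varying vertex set would need a mapping from vertices to rows in place of the fixed offsets used here.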

Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  1. Graduate School of Science and Technology, Kwansei Gakuin University, Sanda, Japan
  2. School of Science and Technology, Kwansei Gakuin University, Sanda, Japan
