Clustering Improves the Exploration of Graph Mining Results

  • Edgar H. de Graaf
  • Joost N. Kok
  • Walter A. Kosters
Part of the IFIP (The International Federation for Information Processing) book series (IFIPAICT, volume 247)


Mining frequent subgraphs is an area of research in which we are given a set of graphs and search for (connected) subgraphs that are contained in many of these graphs. Each graph can be seen as a transaction, or as a molecule, since the techniques applied in this paper are used in (bio)chemical analysis.
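To make "contained in many of these graphs" concrete, the following is a minimal sketch that counts in how many graphs a candidate fragment occurs. It is not the mining algorithm used by the authors: the graph encoding (a dict of node labels plus a set of undirected edges), the brute-force backtracking test, and the names `contains`, `support` and `min_support` are all assumptions made for illustration, and bond types are ignored.

```python
def contains(graph, fragment):
    """Return True if `fragment` occurs in `graph` as a (non-induced) subgraph.

    Both arguments are (labels, edges) pairs: `labels` maps node id -> label,
    `edges` is a set of frozenset({u, v}) pairs. Hypothetical encoding.
    """
    g_labels, g_edges = graph
    f_labels, f_edges = fragment
    f_nodes = list(f_labels)

    def extend(mapping):
        # All fragment nodes have been placed: an embedding exists.
        if len(mapping) == len(f_nodes):
            return True
        u = f_nodes[len(mapping)]
        for v in g_labels:
            if v in mapping.values() or g_labels[v] != f_labels[u]:
                continue
            # Every fragment edge from u to an already mapped node
            # must also be present between the images in the graph.
            consistent = all(
                frozenset((v, mapping[w])) in g_edges
                for w in mapping
                if frozenset((u, w)) in f_edges
            )
            if consistent:
                mapping[u] = v
                if extend(mapping):
                    return True
                del mapping[u]
        return False

    return extend({})


def support(graphs, fragment):
    """Number of graphs in the dataset that contain the fragment."""
    return sum(contains(g, fragment) for g in graphs)


def edge(u, v):
    return frozenset((u, v))


# Toy "molecules": C-C-O, C-O and C-C-N (atom labels only).
graphs = [
    ({0: "C", 1: "C", 2: "O"}, {edge(0, 1), edge(1, 2)}),
    ({0: "C", 1: "O"}, {edge(0, 1)}),
    ({0: "C", 1: "C", 2: "N"}, {edge(0, 1), edge(1, 2)}),
]
frag_co = ({0: "C", 1: "O"}, {edge(0, 1)})
frag_cco = ({0: "C", 1: "C", 2: "O"}, {edge(0, 1), edge(1, 2)})

min_support = 2  # assumed threshold for "many"
frequent = [f for f in (frag_co, frag_cco) if support(graphs, f) >= min_support]
```

With the assumed threshold of 2, only the C-O fragment is frequent in this toy dataset. Practical miners avoid testing every candidate against every graph in this way; the sketch only illustrates the definition of support.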

In this work we discuss an application that lets the user further explore the results of a frequent subgraph mining algorithm. Such an algorithm returns the frequent subgraphs, also referred to as fragments, of the graphs in the dataset. In addition to the frequent subgraphs, the algorithm provides a lattice that models the sub- and supergraph relations among the fragments; this lattice can be explored with our application. The lattice can also be used to group fragments by means of clustering algorithms, and the user can easily browse from group to group. The application can also restrict the display to a selection of groups that occur in almost the same set of molecules or, conversely, in different molecules. This shows which patterns cover similar or different parts of the dataset.
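One way to read "occur in almost the same set of molecules" is as a small distance between the sets of molecules in which two fragments occur. The sketch below assumes a Jaccard distance on these occurrence sets and a simple single-linkage threshold grouping; the abstract specifies neither the distance measure nor the clustering algorithm, so both are stand-ins, and the names `jaccard_distance`, `group_fragments` and `max_distance` are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|: 0.0 for identical sets, 1.0 for disjoint sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def group_fragments(occurrences, max_distance):
    """Merge fragments whose occurrence sets lie within max_distance.

    `occurrences` maps a fragment name to the set of molecule ids containing it.
    Uses a small union-find structure, so grouping is transitive (single linkage).
    """
    parent = {f: f for f in occurrences}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path halving
            f = parent[f]
        return f

    for f, g in combinations(occurrences, 2):
        if jaccard_distance(occurrences[f], occurrences[g]) <= max_distance:
            parent[find(f)] = find(g)

    groups = {}
    for f in occurrences:
        groups.setdefault(find(f), []).append(f)
    return sorted(sorted(members) for members in groups.values())


# Toy data: fragment -> ids of the molecules in which it occurs.
occurrences = {
    "C-C": {1, 2, 3, 4},
    "C-C-O": {1, 2, 3},
    "C-N": {5, 6},
    "C-N-C": {5, 6, 7},
}
groups = group_fragments(occurrences, max_distance=0.4)
```

Here the two carbon-oxygen related fragments end up in one group and the two nitrogen fragments in another, because the two groups cover disjoint parts of the toy dataset. A lattice-aware variant could restrict merging to fragments that are sub- or supergraphs of each other, but that would be a further assumption.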


Distance Matrix, Pattern Mining, Lattice Information, Position Weight Matrix, Graph Mining



Copyright information

© International Federation for Information Processing 2007

Authors and Affiliations

  • Edgar H. de Graaf (1)
  • Joost N. Kok (1)
  • Walter A. Kosters (1)

  1. Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands
