Skip to main content

Graph-Based Tools for Data Mining and Machine Learning

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 2734))

Abstract

Many powerful methods for intelligent data analysis have become available in the fields of machine learning and data mining. However, almost all of these methods are based on the assumption that the objects under consideration are represented in terms of feature vectors, or collections of attribute values. In the present paper we argue that symbolic representations, such as strings, trees or graphs, have a representational power that is significantly higher than the representational power of feature vectors. On the other hand, operations on these data structure that are typically needed in data mining and machine learning are more involved than their counterparts on feature vectors. However, recent progress in graph matching and related areas has led to many new practical methods that seem to be very promising for a wide range of applications.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Hand, D., Mannila, H., Smyth, P.: Principles of Data Mining. The MIT Press (2001)

    Google Scholar 

  2. Kantardzic, M.: Data Mining-Concepts, Models, Methods, and Algorithms. Wiley-Interscience (2003)

    Google Scholar 

  3. Mitchell, T.: Machine Learning. McGraw Hill (1997)

    Google Scholar 

  4. Fu, K.-S.: Syntactic Pattern Recognition and Applications. Prentice Hall (1982)

    Google Scholar 

  5. Bunke, H., Sanfeliu, A. (eds.): Syntactic and Structural Pattern Recognition-Theory and Applications. World Scientific (1990)

    Google Scholar 

  6. Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. Journal of the Association for Computing Machinery 21(1) (1974) 168–173

    MATH  MathSciNet  Google Scholar 

  7. Ullman, J.R.: An algorithm for subgraph isomorphism. Journal of the ACM 23(1) (1976) 31–42

    Article  MATH  Google Scholar 

  8. Messmer, B.T., Bunke, H.: A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Trans. PAMI 20(5) (1998) 493–507

    Google Scholar 

  9. McGregor, J.: Backtrack search algorithms and the maximal common subgraph problem. Software-Practice and Experience 12 (1982) 23–13

    Article  MATH  Google Scholar 

  10. Levi, G.: A note on the derivation of maximal common subgraphs of two directed or undirected graphs. Calcolo 9 (1972) 341–354

    Article  MathSciNet  Google Scholar 

  11. Bunke, H., Shearer, K.: A graph distance metric based on the maximal common subgraph. Pattern Recognition Letters 19 (1998) 255–259

    Article  MATH  Google Scholar 

  12. Wallis, W.D., Shoubridge, P., Kraetzl, M., Ray, D.: Graph distances using graph union Pattern Recognition Letters 22 (2001) 701–704

    Article  MATH  Google Scholar 

  13. Fernandez, M.L, Valiente, G.: A graph distance metric combining maximum common subgraph and minimum common supergraph. Pattern Recognition Letters 22 (2001) 753–758

    Article  MATH  Google Scholar 

  14. Bunke, H.: Error correcting graph matching: on the influence of the underlying cost function. IEEE Trans. PAMI 21 (1999) 917–922

    Google Scholar 

  15. Conte, D., Foggia, P., Sansone, C., Vento, M.: Thirty years of graph matching in pattern recognition. Int. Journal of Pattern Recognition and Art. Intelligence, Special issue on Graph Matching in Computer Vision and Pattern Recognition (to appear)

    Google Scholar 

  16. Jain, A., Murty, M., Flynn, P.: Data clustering: a review. ACM Computing Surveys 31 (1999) 264–323

    Article  Google Scholar 

  17. Seong, D., Kim, H., Park, K.: Incremental clustering of attributed graphs. IEEE Trans. SMC (1993) 1399–1411

    Google Scholar 

  18. Luo, B., Wilson, R.E., Hancock, E.: Spectral feature vectors for graph clustering. In: Caelli, T. et al. (eds.): Structural, Syntactic and Statistical Pattern Recognition. Lecture Notes in Computer Science, Vol. 2396. Springer (2002) 83–93

    Chapter  Google Scholar 

  19. Sanfeliu, A., Serratosa, F., Alquezar, R.: Synthesis of function-described graphs and clustering of attributed graph. Int. Journal of Pattern Recognition and Art. Intell. 16 (2002) 621–655

    Article  Google Scholar 

  20. Zahn, C.T.: Graph-theoretical methods for detecting and describing Gestalt structures. IEEE Trans. Computers C-20 (1971) 68–86

    Article  Google Scholar 

  21. Jiang, X., Münger, A., Bunke, H.: On median graphs: properties, algorithms, and applications. IEEE Trans. Pattern Analysis and Machine Intell. 23(10) (2001) 1144–1151

    Article  Google Scholar 

  22. Bunke, H., Günter, S.: Weighted mean of a pair of graphs. Computing 67 (2001) 209–224

    Article  MATH  MathSciNet  Google Scholar 

  23. Bunke, H., Kandel, A.: Mean and maximal common subgraph of two graphs. Pattern Recognition Letters 21 (2000) 163–168

    Article  Google Scholar 

  24. Günter, S., Bunke, H.: Self-organizing map for clustering in the graph domain. Pattern Recognition Letters 23 (2002) 401–417

    Article  Google Scholar 

  25. Günter, S., Bunke, H.: Validation indices for graph clustering. Pattern Recognition Letters 24 (2003) 1107–1113

    Article  MATH  Google Scholar 

  26. Hopcroft, J., Wong, J.: Linear time algorithm for isomorphism of planar graphs. Proc. 6th Annual Symposium on Theory of Computing (1974) 172–184

    Google Scholar 

  27. Jiang, X., Bunke, H.: Marked subgraph isomorphism of ordered graphs. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (eds.): Advances in Pattern Recognition. Lecture Notes in Computer Science, Vol. 1451. Springer Verlag (1998) 122–131

    Chapter  Google Scholar 

  28. Dickinson, P., Bunke, H., Dadej, A., Kraetzl, M.: On graphs with unique node labels. Proc. 4th IAPR Workshop on Graph based Representations, York (2003)

    Google Scholar 

  29. Bunke, H., Kraetzl, M., Shoubridge, P., Wallis, W.: Detection of abnormal change in time series of graphs. Journal of Interconnection Networks 3 (2002) 85–101

    Article  Google Scholar 

  30. Bunke, H., Kraetzl, M.: Classification and detection of abnormal events in time series of graphs. In: Last, M., Kandel, A., Bunke, H. (eds.): Data Mining in Time Series Databases. World Scientific (2003)

    Google Scholar 

  31. Schenker, A., Last, M., Bunke, H., Kandel, A.: Clustering of documents using a graph model. In: Antonacopoulos, A., Hu, J. (eds.): Web Document Analysis: Challenges and Opportunities. World Scientific (2003)

    Google Scholar 

  32. Schenker, A., Last, M., Bunke, H., Kandel, A.: Classification of web documents using graph matching. Int. Journal of Pattern Recognition and Art. Intelligence (to appear)

    Google Scholar 

  33. Salton, G.: Automatic Text Processing: the Transformation, Analysis, and Retrieval of Information by Computer. Addison Wesley (1989)

    Google Scholar 

  34. Schenker, A., Last, M., Bunke, H., Kandel, A.: Graph representations for web document clustering. Proc. 1st Iberian Conf. on Pattern Recognition and Image Analysis (2003)

    Google Scholar 

  35. Ristad, E., Yianilos, P.: Learning string edit distance. IEEE Trans. PAMI 20 (1998) 522–532

    Google Scholar 

  36. Parizeau, M., Ghazzali, N., Hebert, J.-F.: Optimizing the cost matrix for approximate string matching using genetic algorithms. Pattern Recognition 31 (1998) 431–440

    Article  Google Scholar 

  37. Gomez-Ballester, M., Forcada, M., Mico, M.: A gradient-ascent method to adapt the edit distance to a classification task. In: Torres, M.I., Sanfeliu, A. (eds.): Pattern Recognition and Applications. IOS Press (2000) 13–18

    Google Scholar 

  38. Ambauen, R., Fischer, S., Bunke, H.: Graph edit distance with node splitting and merging and its application to diatom identification. Proc. 4th IAPR Workshop on Graph based Representations, York (2003)

    Google Scholar 

  39. Neuhaus, M., Bunke, H.: Self-organizing graph edit distance. Proc. 4th IAPR Workshop on Graph based Representations, York (2003)

    Google Scholar 

  40. Neuhaus, M.: Learning Graph Edit Distance. Diploma Thesis, University of Bern (2003)

    Google Scholar 

  41. Kohonen, T.: Self-Organizing Maps. Springer Verlag (1995)

    Google Scholar 

  42. Irniger, Ch., Bunke, H.: Graph Matching: Filtering large databases of graphs using decision trees. In: Jolion, J.-M., Kropatsch, W., Vento, M. (eds.): Proc. 3rd IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition. (2001) 239–249

    Google Scholar 

  43. Irniger, Ch., Bunke, H.: Theoretical analysis and experimental comparison of graph matching algorithms for database filtering. Proc. 4th IAPR Workshop on Graph based Representations, York (2003)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bunke, H. (2003). Graph-Based Tools for Data Mining and Machine Learning. In: Perner, P., Rosenfeld, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2003. Lecture Notes in Computer Science, vol 2734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45065-3_2

Download citation

  • DOI: https://doi.org/10.1007/3-540-45065-3_2

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40504-7

  • Online ISBN: 978-3-540-45065-8

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics