On the Usefulness of Weight-Based Constraints in Frequent Subgraph Mining

  • Frank Eichinger
  • Matthias Huber
  • Klemens Böhm
Conference paper

Abstract

Frequent subgraph mining is an important data-mining technique. In this paper we look at weighted graphs, which are ubiquitous in the real world. Analyzing weights in combination with mining for substructures might yield more precise results. In particular, we study frequent subgraph mining in the presence of weight-based constraints and explain how to integrate them into mining algorithms. Although such constraints yield only approximate mining results in most cases, we demonstrate that these results are nevertheless useful, and we explain this effect. To do so, we assess the completeness of the approximate result sets and carry out application-oriented studies with real-world data-analysis problems: software-defect localization and explorative mining in transportation logistics. Our results show that the runtime can improve by a factor of up to 3.5 in defect localization and up to 7 in explorative mining. At the same time, defect-localization precision even increases slightly, and we obtain good explorative mining results.
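
The trade-off described in the abstract (less work, but possibly incomplete results) can be illustrated with a small sketch. This is not the authors' algorithm: the toy database, the average-weight constraint, and the names `DB`, `edge`, `support`, `mean_weight`, and `mine` are all assumptions made for illustration. Node labels are assumed to be unique, so that a subgraph test reduces to checking edge membership; real miners such as gSpan need subgraph isomorphism and canonical pattern enumeration.

```python
from itertools import chain

def edge(u, v):
    """An undirected edge between two uniquely labeled nodes (toy assumption)."""
    return frozenset((u, v))

# Hypothetical database: each graph maps an edge to a numeric weight.
DB = [
    {edge("a", "b"): 9.0, edge("b", "c"): 1.0, edge("c", "d"): 9.0},
    {edge("a", "b"): 8.0, edge("b", "c"): 1.0, edge("c", "d"): 8.0},
    {edge("a", "b"): 9.0, edge("b", "c"): 1.0},
]

def support(pattern):
    """Graphs that contain every edge of the pattern."""
    return [g for g in DB if all(e in g for e in pattern)]

def mean_weight(pattern):
    """Average weight of the pattern's edges over all supporting graphs."""
    ws = [g[e] for g in support(pattern) for e in pattern]
    return sum(ws) / len(ws)

def mine(min_sup, min_avg, push_constraint):
    """Level-wise growth of connected edge sets.

    Returns (patterns satisfying both constraints, number of candidates checked).
    With push_constraint=True, candidates violating the weight constraint are
    not extended, which saves work but can miss valid larger patterns.
    """
    all_edges = set(chain.from_iterable(DB))
    level = {frozenset([e]) for e in all_edges}
    result, checked = set(), 0
    while level:
        next_level = set()
        for p in level:
            checked += 1
            if len(support(p)) < min_sup:
                continue  # frequency is anti-monotone: pruning here is safe
            if mean_weight(p) >= min_avg:
                result.add(p)
            elif push_constraint:
                continue  # weight-based pruning: not anti-monotone, so lossy
            nodes = set().union(*p)
            for e in all_edges - p:
                if e & nodes:  # extend only by adjacent edges (stay connected)
                    next_level.add(p | {e})
        level = next_level
    return result, checked

complete, n_complete = mine(min_sup=2, min_avg=5.0, push_constraint=False)
approx, n_approx = mine(min_sup=2, min_avg=5.0, push_constraint=True)
print(len(complete), n_complete, len(approx), n_approx)
```

In this toy setting the three-edge path a-b-c-d satisfies the average-weight constraint, but both of its connected two-edge sub-patterns do not, so pushing the constraint into the search skips it. The approximate run checks fewer candidates and returns a subset of the complete result, which is the kind of completeness-versus-runtime trade-off the paper evaluates.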

Keywords

Edge Weight; Defect Localization; Mining Algorithm; Weighted Graph; Graph Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
