On the Usefulness of Weight-Based Constraints in Frequent Subgraph Mining

  • Conference paper
Research and Development in Intelligent Systems XXVII (SGAI 2010)

Abstract

Frequent subgraph mining is an important data-mining technique. In this paper we look at weighted graphs, which are ubiquitous in the real world. Analyzing weights in combination with mining for substructures might yield more precise results. In particular, we study frequent subgraph mining in the presence of weight-based constraints and explain how to integrate them into mining algorithms. While such constraints yield only approximate mining results in most cases, we demonstrate that these results are nevertheless useful and explain this effect. To do so, we assess the completeness of the approximate result sets and carry out application-oriented studies on real-world data-analysis problems: software-defect localization and explorative mining in transportation logistics. Our results show that the runtime can improve by a factor of up to 3.5 in defect localization and 7 in explorative mining. At the same time, we obtain slightly increased defect-localization precision and good explorative mining results.

Author information

Correspondence to Frank Eichinger, Matthias Huber or Klemens Böhm.

Copyright information

© 2011 Springer-Verlag London Limited

About this paper

Cite this paper

Eichinger, F., Huber, M., Böhm, K. (2011). On the Usefulness of Weight-Based Constraints in Frequent Subgraph Mining. In: Bramer, M., Petridis, M., Hopgood, A. (eds) Research and Development in Intelligent Systems XXVII. SGAI 2010. Springer, London. https://doi.org/10.1007/978-0-85729-130-1_5

  • DOI: https://doi.org/10.1007/978-0-85729-130-1_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-129-5

  • Online ISBN: 978-0-85729-130-1

  • eBook Packages: Computer Science, Computer Science (R0)
