The Effect of Background Knowledge in Graph-Based Learning in the Chemoinformatics Domain
Machine learning systems typically use a set of previous experiences (examples) to learn concepts, patterns, or relations hidden within the data. Current machine learning approaches are challenged by the growing size of data repositories and the growing complexity of the data they hold [1, 2]. Several methods have been introduced in the field of machine learning to meet the requirement of learning from complex data. Based on the way the input and the resulting hypotheses are represented, these methods fall into two main categories: logic-based and graph-based. The two categories differ in how they represent data, how they form and test hypotheses, and in the form of the output they produce.
The main purpose of our study is to investigate the effect of incorporating background knowledge into graph learning methods. The ability of graph learning methods to obtain accurate theories with a minimum of background knowledge is of course a desirable property, but being unable to make effective use of additional knowledge that is available and has proven important is clearly a disadvantage. We therefore examine how far additional, already available background knowledge can be used effectively to increase the performance of a graph learner. Another contribution of our study is that it establishes a neutral ground for comparing the classification accuracies of the two closely related approaches, making it possible to study whether graph learning methods would actually outperform inductive logic programming (ILP) methods if the same background knowledge were utilized.
The rest of this chapter is organized as follows. The next section discusses related work concerning the contribution of background knowledge when learning from complex data. Section 10.3 provides a description of the graph learning method that is used in our study. The experimental setup, empirical evaluation, and the results from the study are described in Sect. 10.4. Finally, Sect. 10.5 provides conclusions from the experiments and points out interesting extensions of the work reported in this study.
Keywords: Background Knowledge, Inductive Logic Programming, Edge Label, Node Label, Canonical Label
- 1. Mitchell, T.M. (2006), The Discipline of Machine Learning, CMU-ML-06-108, School of Computer Science, Carnegie Mellon University, Pittsburgh.
- 3. Ketkar, N., Holder, L., and Cook, D. (2005), Comparison of graph-based and logic-based MRDM, ACM SIGKDD Explorations, 7(2) (Special Issue on Link Mining).
- 4. Muggleton, S. and De Raedt, L. (1994), Inductive logic programming: Theory and methods, Journal of Logic Programming.
- 5. Agrawal, R. and Srikant, R. (1994), Fast algorithms for mining association rules, VLDB, Chile, pp. 487–99.
- 8. Srinivasan, A., King, R.D., and Muggleton, S. (1999), The role of background knowledge: Using a problem from chemistry to examine the performance of an ILP program, TR PRG-TR-08-99, Oxford.
- 9. Gonzalez, J., Holder, L.B., and Cook, D.J. (2001), Application of graph-based concept learning to the predictive toxicology domain, in Proceedings of the Predictive Toxicology Challenge Workshop.
- 10. Srinivasan, A., Muggleton, S.H., Sternberg, M.J.E., and King, R.D. (1995), The effect of background knowledge in inductive logic programming: A case study, PRG-TR-9-95, Oxford University Computing Laboratory.
- 11. Lodhi, H. and Muggleton, S.H. (2005), Is mutagenesis still challenging?, in Proceedings of the 15th International Conference on Inductive Logic Programming, ILP 2005, Late-Breaking Papers, pp. 35–40.
- 12. Lavrac, N., Zelezny, F., and Flach, P. (2002), RSD: Relational subgroup discovery through first-order feature construction, in Proceedings of the 12th International Conference on Inductive Logic Programming (ILP’02), Springer-Verlag, New York.
- 13. Flach, P. and Lachiche, N. (1999), 1BC: A first-order Bayesian classifier, in S. Džeroski and P. Flach (Eds.), Proceedings of the 9th International Workshop on Inductive Logic Programming, pp. 92–103. Springer-Verlag, New York.
- 15. Quinlan, J.R. and Cameron-Jones, R.M. (1993), FOIL, in Proceedings of the 6th European Conference on Machine Learning, Lecture Notes in Artificial Intelligence, Vol. 667, pp. 3–20. Springer-Verlag, New York.
- 16. Blockeel, H. and De Raedt, L., Top-down induction of first-order logical decision trees, Artificial Intelligence, 101(1–2):285–297.
- 17. Zaki, M.J. and Aggarwal, C.C. (2003), XRules: An effective structural classifier for XML data, KDD, Washington, DC, ACM, pp. 316–325.
- 18. Cook, J. and Holder, L. (1994), Graph-based relational learning: Current and future directions, JAIR, 1:231–255.
- 19. Fischer, I. and Meinl, T. (2004), Graph based molecular data mining: An overview, in IEEE SMC 2004 Conference Proceedings, pp. 4578–4582.
- 20. Ketkar, N., Holder, L., and Cook, D. (2005), Qualitative comparison of graph-based and logic-based multi-relational data mining: A case study, in Proceedings of the ACM KDD Workshop on Multi-Relational Data Mining, August 2005.
- 21. Xifeng, Y. and Jiawei, H. (2002), gSpan: Graph-based substructure pattern mining, in Second IEEE International Conference on Data Mining (ICDM’02), p. 721.
- 22. Borgwardt, K.M. and Kriegel, H.P. (2005), Shortest-path kernels on graphs, ICDM, pp. 74–81.
- 23. Ramon, J. and Gaertner, T. (2003), Expressivity versus efficiency of graph kernels, in Proceedings of the First International Workshop on Mining Graphs, Trees and Sequences, pp. 65–74.
- 24. Karunaratne, T. and Boström, H. (2006), Learning from structured data by finger printing, in Proceedings of 9th Scandinavian Conference of Artificial Intelligence, Helsinki, Finland (to appear).
- 26. Srinivasan, A., King, R.D., Muggleton, S.H., and Sternberg, M.J.E. (1997), Carcinogenesis predictions using ILP, in Proceedings of the 7th International Workshop on Inductive Logic Programming.
- 27. US National Toxicology Program, http://ntp.niehs.nih.gov/index.cfm?objectid=32BA9724-F1F6-975E-7FCE50709CB4C932.
- 28. The predictive toxicology dataset, at ftp site: ftp://ftp.cs.york.ac.uk/pub/ML_GROUP/Datasets/carcinogenesis.
- 30. Helma, C., Kramer, S., and De Raedt, L. (2002), The molecular feature miner molfea, molecular informatics: Confronting complexity, in Proceedings of the Beilstein-Institut Workshop, Bozen, Italy.