AND Parallelism for ILP: The APIS System

Camacho, Rui; Ramos, Ruy; Fonseca, Nuno A.

doi:10.1007/978-3-662-44923-3_7


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8812)

Included in the following conference series:

International Conference on Inductive Logic Programming


Abstract

Inductive Logic Programming (ILP) is a well-known approach to Multi-Relational Data Mining. ILP systems may take a long time to analyze the data, mainly because the hypothesis search spaces are often very large and the evaluation of each hypothesis, which involves theorem proving, may be quite time-consuming in some domains. To address these efficiency issues, we propose the APIS (And ParallelISm for ILP) system, which uses results from Logic Programming AND-parallelism. The approach partitions the search space into sub-spaces of two kinds: sub-spaces where clause evaluation requires theorem proving, and sub-spaces where clause evaluation is performed quite efficiently without resorting to a theorem prover. We have also defined a new type of redundancy (Coverage-equivalent redundancy) that enables pruning of significant parts of the search space. The new type of pruning, together with the partition of the hypothesis space, considerably improved the performance of the APIS system. An empirical evaluation of the APIS system on standard ILP data sets shows considerable speedups without loss of accuracy of the models constructed.
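
The two kinds of sub-spaces can be illustrated with a small sketch based on the general notion of independent AND-parallelism. It is not the APIS implementation. It assumes a clause body is a list of `(predicate, args)` tuples, that variables are strings starting with an uppercase letter, and that coverage lists for the single-group clauses have already been obtained with a theorem prover. The function names, the example literals, and the example identifiers are all hypothetical. Body literals that share no variables other than those of the head fall into separate groups, and a clause that conjoins such independent groups covers exactly the intersection of the examples covered by each group, so its coverage can be computed by set intersection alone.

```python
def variables(literal):
    """Variables of a literal (predicate, args); uppercase-initial strings by assumption."""
    _, args = literal
    return {a for a in args if isinstance(a, str) and a[:1].isupper()}


def islands(body, head_vars):
    """Split body literals into groups that share no variables except head variables."""
    groups = []  # each entry: (local variables of the group, literals of the group)
    for lit in body:
        local = variables(lit) - head_vars
        merged_vars, merged_lits = set(local), []
        untouched = []
        for gvars, glits in groups:
            if gvars & local:          # shares a local variable: same group
                merged_vars |= gvars
                merged_lits += glits
            else:
                untouched.append((gvars, glits))
        merged_lits.append(lit)
        groups = untouched + [(merged_vars, merged_lits)]
    return [lits for _, lits in groups]


def combined_coverage(coverages):
    """Coverage of a clause conjoining independent groups: intersect their coverages."""
    return set.intersection(*coverages)


# Hypothetical clause: active(M) :- atom(M,A,carbon), logp(M,L), bond(M,A,B), gteq(L,2.0)
body = [
    ("atom", ("M", "A", "carbon")),
    ("logp", ("M", "L")),
    ("bond", ("M", "A", "B")),
    ("gteq", ("L", 2.0)),
]
parts = islands(body, head_vars={"M"})

# Hypothetical coverage sets, one per group, as a theorem prover might have returned them.
covered = combined_coverage([{"m1", "m2", "m3"}, {"m2", "m3", "m4"}])
```

Here `parts` holds two groups, one linked through `A` and one through `L`, and `covered` is obtained without any further proof attempts.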


Notes

1. Counting the number of examples derivable from the hypothesis and the background knowledge.
2. As many as the size of the sample.
3. Opposite from what happens when literals share variables.
4. Source data for both data sets is available from the Distributed Structure-Searchable Toxicity (DSSTox) Public Data Base Network from the U.S. Environmental Protection Agency, http://www.epa.gov/ncct/dsstox/index.html, accessed Dec 2008.
5. http://www.cs.ox.ac.uk/activities/machlearn/applications.html
6. Except for the carcinogenesis data set.


Acknowledgments

This work has been partially supported by Fundação para a Ciência e Tecnologia (FCT) through the project ADE (PTDC/EIA-EIA/121686/2010 (FCOMP-01-0124-FEDER-020575)). The work was also partially supported by project NORTE-07-0124-FEDER-000059, financed by the North Portugal Regional Operational Programme (ON.2 O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), and by national funds, through the Portuguese funding agency, FCT.

Author information

Authors and Affiliations

DEI and Faculdade de Engenharia and LIAAD-INESCTEC, Universidade do Porto, Porto, Portugal
Rui Camacho & Ruy Ramos
EMBL Outstation, European Bioinformatics Institute (EBI) and CRACS-INESCTEC, Cambridge, UK
Nuno A. Fonseca


Corresponding author

Correspondence to Rui Camacho.

Editor information

Editors and Affiliations

University of Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
Gerson Zaverucha
University of Porto, Porto, Portugal
Vítor Santos Costa
Fluminense Federal University, Niterói, Rio de Janeiro, Brazil
Aline Paes

A Composition of the Dataset’s Islands

Table 5 shows the partial composition of the islands that were used to define the hypothesis sub-spaces. The table lists only the predicates that appear in the models constructed in the sequential execution runs.

Table 5. Island’s membership of the predicates that appear in the final theories induced by the APIS system.


About this paper

Cite this paper

Camacho, R., Ramos, R., Fonseca, N.A. (2014). AND Parallelism for ILP: The APIS System. In: Zaverucha, G., Santos Costa, V., Paes, A. (eds) Inductive Logic Programming. ILP 2013. Lecture Notes in Computer Science, vol 8812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44923-3_7


DOI: https://doi.org/10.1007/978-3-662-44923-3_7
Published: 24 September 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44922-6
Online ISBN: 978-3-662-44923-3
eBook Packages: Computer Science, Computer Science (R0)
