AND Parallelism for ILP: The APIS System

  • Conference paper
Inductive Logic Programming (ILP 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8812)

Abstract

Inductive Logic Programming (ILP) is a well-known approach to Multi-Relational Data Mining. ILP systems may take a long time to analyze the data, mainly because the search (hypothesis) spaces are often very large and the evaluation of each hypothesis, which involves theorem proving, may be quite time-consuming in some domains. To address these efficiency issues, we propose the APIS (And ParallelISm for ILP) system, which uses results from Logic Programming AND-parallelism. The approach enables the partition of the search space into sub-spaces of two kinds: sub-spaces where clause evaluation requires theorem proving, and sub-spaces where clause evaluation is performed quite efficiently without resorting to a theorem prover. We have also defined a new type of redundancy (Coverage-equivalent redundancy) that enables the pruning of significant parts of the search space. The new type of pruning, together with the partition of the hypothesis space, considerably improved the performance of the APIS system. An empirical evaluation of the APIS system on standard ILP data sets shows considerable speedups without loss of accuracy in the models constructed.
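The two ideas in the abstract can be illustrated with a minimal sketch. It is not the APIS implementation, and it rests on assumptions the abstract does not confirm: that each sub-space ("island") groups literals sharing no variables other than the head variable, that a clause joining literals from different islands therefore covers exactly the intersection of the examples covered by its parts, and that two clauses are coverage-equivalent when they cover the same set of examples. All predicate names, example identifiers, and function names are hypothetical.

```python
from itertools import product

def prune_coverage_equivalent(clauses):
    """Keep one clause per distinct coverage set (the first one seen).

    `clauses` maps a clause (as text) to the set of example ids it covers.
    Assumption: clauses with identical coverage are interchangeable for search.
    """
    kept = {}
    for clause, coverage in clauses.items():
        kept.setdefault(frozenset(coverage), clause)
    return {clause: set(coverage) for coverage, clause in kept.items()}

def combine_islands(island_a, island_b):
    """Evaluate cross-island clauses without a theorem prover.

    Assumption: literals from different islands share only the head variable,
    so the coverage of the conjunction is the intersection of the coverages.
    """
    a = prune_coverage_equivalent(island_a)
    b = prune_coverage_equivalent(island_b)
    return {
        f"{clause_a}, {clause_b}": cov_a & cov_b
        for (clause_a, cov_a), (clause_b, cov_b) in product(a.items(), b.items())
    }

# Hypothetical coverage sets, as if already computed by theorem proving
# inside each island.
island_1 = {
    "atom(M,c)": {1, 2, 3},
    "atom(M,n)": {2, 3, 4},
    "atom(M,c), charge_pos(M)": {1, 2, 3},  # same coverage as atom(M,c)
}
island_2 = {"logp_high(M)": {1, 3, 5}}

combined = combine_islands(island_1, island_2)
for clause, coverage in combined.items():
    print(clause, sorted(coverage))
```

In this sketch the theorem prover would be needed only to fill the per-island coverage sets; every cross-island clause is then scored with a set intersection, and the third clause of `island_1` is dropped before any combination is attempted.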

Notes

  1. Counting the number of examples derivable from the hypothesis and the background knowledge.

  2. As many as the size of the sample.

  3. Opposite from what happens when literals share variables.

  4. Source data for both data sets is available from the Distributed Structure-Searchable Toxicity (DSSTox) Public Data Base Network from the U.S. Environmental Protection Agency, http://www.epa.gov/ncct/dsstox/index.html, accessed Dec 2008.

  5. http://www.cs.ox.ac.uk/activities/machlearn/applications.html

  6. Except for the carcinogenesis data set.

Acknowledgments

This work has been partially supported by Fundação para a Ciência e Tecnologia (FCT) through the project ADE (PTDC/EIA-EIA/121686/2010 (FCOMP-01-0124-FEDER-020575)). The work was also partially supported by project NORTE-07-0124-FEDER-000059, financed by the North Portugal Regional Operational Programme (ON.2 O Novo Norte), under the National Strategic Reference Framework (NSRF), through the European Regional Development Fund (ERDF), and by national funds, through the Portuguese funding agency, FCT.

Author information

Corresponding author

Correspondence to Rui Camacho.

A Composition of the Dataset’s Islands

Table 5 shows the partial composition of the islands that were used to define the hypothesis sub-spaces. The table lists only the predicates that appear in the models constructed in the sequential execution runs.

Table 5. Island membership of the predicates that appear in the final theories induced by the APIS system.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Camacho, R., Ramos, R., Fonseca, N.A. (2014). AND Parallelism for ILP: The APIS System. In: Zaverucha, G., Santos Costa, V., Paes, A. (eds) Inductive Logic Programming. ILP 2013. Lecture Notes in Computer Science, vol 8812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44923-3_7

  • DOI: https://doi.org/10.1007/978-3-662-44923-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44922-6

  • Online ISBN: 978-3-662-44923-3

  • eBook Packages: Computer Science (R0)
