L-Modified ILP Evaluation Functions for Positive-Only Biological Grammar Learning

  • Conference paper
Inductive Logic Programming (ILP 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5194)

Abstract

We identify a shortcoming of a standard positive-only clause evaluation function in the context of learning biological grammars. To overcome this shortcoming, we propose L-modification, a modification to this evaluation function that takes the lengths of individual examples into account. We use a set of bio-sequences known as neuropeptide precursor middles (NPP-middles). Grammars induced from these NPP-middles using L-modification perform better than those induced using the standard positive-only clause evaluation function. We also show that L-modification improves the performance of induced grammars when learning on short, medium or long NPP-middles. A potential disadvantage of L-modification is discussed. Finally, we show that the larger the limit on the search space size, the greater the increase in predictive performance arising from L-modification.

Editor information

Filip Železný, Nada Lavrač

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mamer, T., Bryant, C.H., McCall, J. (2008). L-Modified ILP Evaluation Functions for Positive-Only Biological Grammar Learning. In: Železný, F., Lavrač, N. (eds) Inductive Logic Programming. ILP 2008. Lecture Notes in Computer Science, vol 5194. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85928-4_16

  • DOI: https://doi.org/10.1007/978-3-540-85928-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85927-7

  • Online ISBN: 978-3-540-85928-4

  • eBook Packages: Computer Science, Computer Science (R0)
