An Inductive Learning System for XML Documents

  • Conference paper
Inductive Logic Programming (ILP 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4894)

Abstract

This paper presents a complete inductive learning system that aims to produce comprehensible theories for XML document classification. The knowledge representation method is based on a higher-order logic formalism that is particularly suitable for structured-data learning systems. A systematic way of generating predicates is also given. The learning algorithm is a modified standard decision-tree learning algorithm driven by the precision/recall breakeven point. Experimental results on an XML version of the Reuters dataset show that the system is able to produce comprehensible theories with high precision/recall breakeven point values.
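
For readers unfamiliar with the evaluation measure named above, the following is a minimal sketch of one common way to compute a precision/recall breakeven point. It is not taken from the paper: the function name `breakeven_point`, the use of real-valued classifier scores, and the ranking-based definition are all assumptions made for illustration. The sketch relies on the fact that when the number of documents predicted positive equals the number of truly positive documents, precision and recall are equal by construction.

```python
from typing import Sequence


def breakeven_point(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Return the precision/recall breakeven point for one category.

    Hypothetical helper for illustration only. Documents are ranked by
    score, and the top n_pos are treated as predicted positive, where
    n_pos is the number of truly positive documents. At that cut-off,
    precision = recall = true positives / n_pos.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    n_pos = sum(1 for y in labels if y)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")

    # Sort by descending score; ties keep their original order (stable sort).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Count true positives among the top n_pos ranked documents.
    true_positives = sum(1 for _, y in ranked[:n_pos] if y)
    return true_positives / n_pos


if __name__ == "__main__":
    # Made-up scores and labels, not data from the paper.
    example_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
    example_labels = [True, False, True, True, False]
    print(breakeven_point(example_scores, example_labels))
```

In the made-up example, three documents are truly positive and two of the three highest-scoring documents are among them, so precision and recall both equal 2/3 at the cut-off. Other definitions, such as interpolating between the nearest precision and recall values, are also in use.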

Editor information

Hendrik Blockeel, Jan Ramon, Jude Shavlik, Prasad Tadepalli

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, X. (2008). An Inductive Learning System for XML Documents. In: Blockeel, H., Ramon, J., Shavlik, J., Tadepalli, P. (eds) Inductive Logic Programming. ILP 2007. Lecture Notes in Computer Science, vol 4894. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78469-2_28

  • DOI: https://doi.org/10.1007/978-3-540-78469-2_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78468-5

  • Online ISBN: 978-3-540-78469-2

  • eBook Packages: Computer Science, Computer Science (R0)
