An Unsupervised Rule-Based Method to Populate Ontologies from Text

  • Conference paper
Web Information Systems and Technologies (WEBIST 2009)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 45)

Abstract

An increasing amount of information is available on the web, most of it expressed as text. The semantic information in these texts is implicit, since they are mainly intended for human consumption and interpretation. Because unstructured information is not easily handled automatically, an information extraction process is needed to identify concepts and establish relations among them. Ontologies are an appropriate way to represent structured knowledge bases, enabling sharing, reuse and inference. In this paper, an information extraction process is used to populate a domain ontology. It targets Brazilian Portuguese texts from a biographical dictionary of music, which requires specific tools because of language-specific aspects. An unsupervised rule-based method is proposed. Through this process, latent concepts and relations expressed in natural language can be extracted and represented as an ontology, allowing new uses and visualizations of the content, such as semantic browsing and the inference of new knowledge.
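
The general idea of rule-based ontology population can be illustrated with a minimal sketch. It is not the method of the paper, whose rules and tools are not given in the abstract. It assumes hand-written regular expressions over Brazilian Portuguese sentences; the rule set, the property names `birthPlace` and `birthYear`, the helper `extract_triples`, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Capitalized name, optionally joined by Portuguese particles such as
# "de", "da", "do", "dos", "das" (e.g. "Rio de Janeiro").
NAME = r"[A-ZÀ-Ý]\w+(?: (?:d[aeo]s? )?[A-ZÀ-Ý]\w+)*"

# Hypothetical hand-written rules: each pairs a pattern with the
# ontology property that the captured object should fill.
RULES = [
    (re.compile("(?P<subj>" + NAME + r") nasceu (?:em|no|na) (?P<obj>" + NAME + ")"),
     "birthPlace"),
    (re.compile("(?P<subj>" + NAME + r") nasceu [^.]*?\bem (?P<obj>\d{4})"),
     "birthYear"),
]


def extract_triples(text: str) -> List[Triple]:
    """Apply every rule to the text and collect (subject, property, object) triples."""
    triples: List[Triple] = []
    for pattern, prop in RULES:
        for match in pattern.finditer(text):
            triples.append((match.group("subj"), prop, match.group("obj")))
    return triples


if __name__ == "__main__":
    # Illustrative sentence, not taken from the dictionary used in the paper.
    sentence = "Tom Jobim nasceu no Rio de Janeiro em 1927."
    for triple in extract_triples(sentence):
        print(triple)
```

Each resulting triple could then be asserted as a property of an individual in the domain ontology. A realistic system would likely need tokenization, named entity recognition and anaphora resolution for Portuguese before rules like these become reliable.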

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Motta, E., Siqueira, S., Andreatta, A. (2010). An Unsupervised Rule-Based Method to Populate Ontologies from Text. In: Cordeiro, J., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2009. Lecture Notes in Business Information Processing, vol 45. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12436-5_12

  • DOI: https://doi.org/10.1007/978-3-642-12436-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12435-8

  • Online ISBN: 978-3-642-12436-5

  • eBook Packages: Computer Science (R0)
