CRISOL: An Approach for Automatically Populating Semantic Web from Unstructured Text Collections

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3180)

Abstract

Currently, the main obstacle to the development of the Semantic Web is the manual tagging of web pages according to a given ontology that conceptualizes their domain. This task is usually hard, even for experts, and it is prone to errors because users can interpret the same documents differently. In this paper we address the problem of automatically generating ontology instances from a collection of unstructured documents (e.g. plain texts, HTML pages). These instances will populate the Semantic Web that is described by the ontology. The proposed approach combines Information Extraction techniques, mainly entity recognition, information merging and Text Mining techniques. This approach has been successfully applied in the development of a Semantic Web for archaeology research.
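The idea of turning unstructured text into ontology instances can be illustrated with a toy example. The sketch below is not the CRISOL method, which this page does not describe in detail; it only assumes a hypothetical one-concept ontology (`ArchaeologicalSite`), uses regular expressions as a stand-in for entity recognition, and merges values that refer to the same site name. The ontology, the patterns, the function names and the sample sentences (including the site name) are all invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-ontology: one concept, two properties,
# each recognised by a toy regular expression (an assumption, not CRISOL).
ONTOLOGY = {
    "ArchaeologicalSite": {
        "name": re.compile(r"site of ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
        "period": re.compile(r"\b(Bronze Age|Iron Age|Roman|Medieval)\b"),
    }
}


def extract_entities(text):
    """Entity recognition step: return (property, value) pairs found in one text."""
    found = []
    for prop, pattern in ONTOLOGY["ArchaeologicalSite"].items():
        for match in pattern.finditer(text):
            found.append((prop, match.group(1)))
    return found


def populate(documents):
    """Merging step: values from texts naming the same site form one instance."""
    instances = defaultdict(lambda: defaultdict(set))
    for text in documents:
        entities = extract_entities(text)
        names = [value for prop, value in entities if prop == "name"]
        if not names:
            continue  # no site name found, so there is nothing to attach values to
        key = names[0]
        for prop, value in entities:
            instances[key][prop].add(value)
    # Convert nested sets to sorted lists for a stable, printable result.
    return {
        key: {prop: sorted(values) for prop, values in props.items()}
        for key, props in instances.items()
    }


# Invented sample sentences; "Cerro Ejemplo" is a made-up site name.
docs = [
    "Excavations at the site of Cerro Ejemplo uncovered Bronze Age walls.",
    "The site of Cerro Ejemplo also shows Roman reuse.",
]
result = populate(docs)
print(result)
```

Here the two sentences yield a single `ArchaeologicalSite` instance whose `period` property collects both extracted values. A real system would replace the regular expressions with proper entity recognition and would need a more careful merging criterion than exact name equality.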

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Danger, R., Berlanga, R., Ruiz-Shulcloper, J. (2004). CRISOL: An Approach for Automatically Populating Semantic Web from Unstructured Text Collections. In: Galindo, F., Takizawa, M., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2004. Lecture Notes in Computer Science, vol 3180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30075-5_24

  • DOI: https://doi.org/10.1007/978-3-540-30075-5_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22936-0

  • Online ISBN: 978-3-540-30075-5

  • eBook Packages: Springer Book Archive
