Information Extraction from Concise Passages of Natural Language Sources

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6295)

Abstract

This paper presents a semi-automated approach to information extraction for ontology construction. The sources are short news extracts syndicated online, chosen because their passages convey information concisely and precisely. The brevity of each passage significantly reduces the difficulty of word sense disambiguation. The main goal of the knowledge extraction is to support semi-automated ontology construction.
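As a rough illustration of the kind of input the abstract describes, the sketch below pulls the short passages out of a syndicated news feed. It assumes the feed is RSS 2.0, which the abstract does not state; the sample feed, the function name `extract_passages`, and the output layout are all hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item feed, invented for this example (assumes RSS 2.0).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Bank raises interest rates</title>
      <description>The central bank raised interest rates on Tuesday.</description>
    </item>
    <item>
      <title>River bank floods</title>
      <description>Heavy rain flooded the river bank near the town.</description>
    </item>
  </channel>
</rss>"""


def extract_passages(feed_xml):
    """Return one dict per <item>, holding its title and short description."""
    root = ET.fromstring(feed_xml)
    passages = []
    for item in root.iter("item"):
        # findtext returns None when the element is missing, so fall back to "".
        title = (item.findtext("title") or "").strip()
        text = (item.findtext("description") or "").strip()
        passages.append({"title": title, "text": text})
    return passages


passages = extract_passages(SAMPLE_FEED)
for passage in passages:
    print(passage["title"], "->", passage["text"])
```

Each extracted passage is a sentence or two long, so an ambiguous word such as "bank" appears with only a small, focused context. This is the property the abstract credits with easing word sense disambiguation; the later extraction and ontology-building steps are not shown here.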

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pohorec, S., Verlič, M., Zorman, M. (2010). Information Extraction from Concise Passages of Natural Language Sources. In: Catania, B., Ivanović, M., Thalheim, B. (eds) Advances in Databases and Information Systems. ADBIS 2010. Lecture Notes in Computer Science, vol 6295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15576-5_35

  • DOI: https://doi.org/10.1007/978-3-642-15576-5_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15575-8

  • Online ISBN: 978-3-642-15576-5

  • eBook Packages: Computer Science, Computer Science (R0)
