Semantic Views of Homogeneous Unstructured Data

  • Conference paper
Web Reasoning and Rule Systems (RR 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9209)

Abstract

Homogeneous unstructured data (HUD) are collections of unstructured documents that share common properties, such as a similar layout, a common file format, or a common domain of values. Building on such properties, it would be desirable to process HUD automatically and to access their main information through a semantic layer, typically an ontology, called a semantic view. Hence, we propose an ontology-based approach for extracting semantically rich information from HUD by integrating and extending recent technologies and results from the fields of classical information extraction, table recognition, ontologies, text annotation, and logic programming. Moreover, we design and implement a system, named KnowRex, that has been successfully applied to curricula vitae in the Europass style to offer a semantic view of them, making it possible, for example, to select those that exhibit required skills.

Acknowledgements

The work has been supported by Regione Calabria, programme POR Calabria FESR 2007–2013, within project “KnowRex: Un sistema per il riconoscimento e l’estrazione di conoscenza”.

Author information

Corresponding author

Correspondence to Marco Manna.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Adrian, W.T., Leone, N., Manna, M. (2015). Semantic Views of Homogeneous Unstructured Data. In: ten Cate, B., Mileo, A. (eds) Web Reasoning and Rule Systems. RR 2015. Lecture Notes in Computer Science, vol 9209. Springer, Cham. https://doi.org/10.1007/978-3-319-22002-4_3

  • DOI: https://doi.org/10.1007/978-3-319-22002-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22001-7

  • Online ISBN: 978-3-319-22002-4

  • eBook Packages: Computer Science, Computer Science (R0)
