SCMS – Semantifying Content Management Systems

Ngonga Ngomo, Axel-Cyrille; Heino, Norman; Lyko, Klaus; Speck, René; Kaltenböck, Martin

doi:10.1007/978-3-642-25093-4_13

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7032)

Abstract

The migration to the Semantic Web requires content management systems (CMS) to integrate human- and machine-readable data so that they can themselves be integrated seamlessly into the Semantic Web. Yet there is still a clear need for frameworks that can be easily integrated into CMS and that transform their content into machine-readable knowledge with high accuracy. In this paper, we describe the SCMS (Semantic Content Management Systems) framework, whose main goals are the extraction of knowledge from unstructured data in any CMS and the integration of the extracted knowledge into the same CMS. Our framework integrates a highly accurate knowledge extraction pipeline. In addition, it relies on the RDF and HTTP standards for communication and can thus be integrated into virtually any CMS. We present how our framework is being used in the energy sector. We also evaluate our approach and show that our framework outperforms even commercial software, reaching an F-score of up to 96%.
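
For readers unfamiliar with the metric, the following is a minimal sketch of how a balanced F-score (F1) is commonly computed from extraction counts. The abstract does not state which F-score variant is reported, so F1 is an assumption here; the function name `f_score` and the counts are hypothetical and are not taken from the paper's evaluation.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives  # items the extractor returned
    actual = true_positives + false_negatives     # items in the gold standard
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper):
# precision and recall are both 48/50 = 0.96, so the F-score is 0.96 too.
score = f_score(true_positives=48, false_positives=2, false_negatives=2)
```

Because the harmonic mean is pulled toward the lower of its two inputs, a high F-score requires both precision and recall to be high.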

Author information

Authors and Affiliations

AKSW Group, University of Leipzig, Johannisgasse 26, 04103, Leipzig, Germany
Axel-Cyrille Ngonga Ngomo, Norman Heino, Klaus Lyko & René Speck
Semantic Web Company, Lerchenfeldergürtel 43, A-1160, Vienna, Austria
Martin Kaltenböck

Editor information

Editors and Affiliations

Computer Science Dept., VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Lora Aroyo
IBM Research, NY 10598, Yorktown Heights, USA
Chris Welty
The Open University, Walton Hall, MK7 6AA, Milton Keynes, UK
Harith Alani
Google, USA
Jamie Taylor
University of Zurich, Binzmuehlestrasse 14, 8050, Zurich, Switzerland
Abraham Bernstein
Massachusetts Institute of Technology, 32 Vassar Street, MA 02139, Cambridge, USA
Lalana Kagal
Stanford University, 94305, Stanford, CA, USA
Natasha Noy
Linköping University, 581 83, Linköping, Sweden
Eva Blomqvist

About this paper

Cite this paper

Ngonga Ngomo, AC., Heino, N., Lyko, K., Speck, R., Kaltenböck, M. (2011). SCMS – Semantifying Content Management Systems. In: Aroyo, L., et al. The Semantic Web – ISWC 2011. ISWC 2011. Lecture Notes in Computer Science, vol 7032. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25093-4_13

DOI: https://doi.org/10.1007/978-3-642-25093-4_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25092-7
Online ISBN: 978-3-642-25093-4
eBook Packages: Computer Science, Computer Science (R0)