Abstract
Lexical annotation is the explicit assignment of one or more meanings to a term with respect to a sense inventory (e.g., a thesaurus or an ontology). We propose an automatic supervised lexical annotation method, called ALATK (Automatic Lexical Annotation - Topic Kernel), based on the Topic Kernel function, for annotating schema labels extracted from structured and semi-structured data sources. It exploits Wikipedia both as a sense inventory and as a source of training data.
The research leading to this work was partially supported by the Biogest-Siteia projects (http://www.biogest-siteia.unimore.it), funded by the Emilia-Romagna (Italy) regional government. Our sincere thanks go to Professor Sanda Harabagiu and to the PhD students Bryan Rink and Kirk Roberts for their support of this research.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sorrentino, S., Bergamaschi, S., Parmiggiani, E. (2012). A Supervised Method for Lexical Annotation of Schema Labels Based on Wikipedia. In: Atzeni, P., Cheung, D., Ram, S. (eds) Conceptual Modeling. ER 2012. Lecture Notes in Computer Science, vol 7532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34002-4_28
DOI: https://doi.org/10.1007/978-3-642-34002-4_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34001-7
Online ISBN: 978-3-642-34002-4
eBook Packages: Computer Science, Computer Science (R0)