Abstract
This paper presents a novel method for integrating wordnet data. The method operates on XML representations of wordnet content. In particular, we focus on integrating VisDic-based documents that hold the data of two Polish wordnets, plWordNet and Polnet. A key feature of the method is that it automatically identifies and handles discrepancies in the structure of the integrated documents. Apart from the method itself, we briefly discuss its C#-based implementation. Finally, we present statistical measures describing the data available before and after integration. This comparison allows us to determine, among other things, the impact of each wordnet on the integrated data set.
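To make the idea concrete, the following is a minimal sketch of XML-based integration with structural discrepancy detection. It is not the authors' C# implementation or their algorithm: it is written in Python, the element names (`WN`, `SYNSET`, `ID`, `POS`, `LITERAL`, `DEF`, `USAGE`) and the two sample documents are hypothetical, and matching records on part of speech plus literals is an assumption made only for illustration. The sketch reports which element names occur in one document but not the other, merges the records, and counts how many integrated entries come from each source.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample documents; tag names and contents are illustrative only.
DOC_A = """<WN>
  <SYNSET><ID>a-1</ID><POS>n</POS><LITERAL>pies</LITERAL><DEF>ssak domowy</DEF></SYNSET>
  <SYNSET><ID>a-2</ID><POS>n</POS><LITERAL>kot</LITERAL></SYNSET>
</WN>"""
DOC_B = """<WN>
  <SYNSET><ID>b-1</ID><POS>n</POS><LITERAL>pies</LITERAL><USAGE>Pies szczeka.</USAGE></SYNSET>
  <SYNSET><ID>b-2</ID><POS>v</POS><LITERAL>biec</LITERAL></SYNSET>
</WN>"""


def load_records(xml_text, source):
    """Parse one document into a list of (source, {tag: [texts]}) records."""
    records = []
    for synset in ET.fromstring(xml_text).iter("SYNSET"):
        fields = {}
        for child in synset:
            fields.setdefault(child.tag, []).append((child.text or "").strip())
        records.append((source, fields))
    return records


def tag_discrepancies(records_a, records_b):
    """Return the element names used only in A and only in B."""
    tags_a = {tag for _, fields in records_a for tag in fields}
    tags_b = {tag for _, fields in records_b for tag in fields}
    return tags_a - tags_b, tags_b - tags_a


def integrate(records_a, records_b):
    """Merge records that share POS and literals (an assumed matching rule)."""
    merged = {}
    for source, fields in records_a + records_b:
        key = (fields.get("POS", [""])[0], tuple(sorted(fields.get("LITERAL", []))))
        entry = merged.setdefault(key, {"sources": set(), "fields": {}})
        entry["sources"].add(source)
        for tag, values in fields.items():
            bucket = entry["fields"].setdefault(tag, [])
            for value in values:
                if value not in bucket:  # keep the union of values, no duplicates
                    bucket.append(value)
    return merged


def source_share(merged):
    """Count integrated entries by the set of sources that contributed to them."""
    return Counter("+".join(sorted(e["sources"])) for e in merged.values())


if __name__ == "__main__":
    rec_a, rec_b = load_records(DOC_A, "A"), load_records(DOC_B, "B")
    print(tag_discrepancies(rec_a, rec_b))
    print(source_share(integrate(rec_a, rec_b)))
```

Keeping the union of fields, rather than dropping elements that one source lacks, is one simple way to handle structural differences; the paper's actual handling may differ.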
Acknowledgments
The reported study was partially supported by the European Union through the FP7-PEOPLE-2013-IAPP AutoUniMo project "Automotive Production Engineering Unified Perspective based on Data Mining Methods and Virtual Factory Model" (grant agreement no. 612207) and by research work financed from funds for science in the years 2016–2017 allocated to an international co-financed project (grant agreement no. 3491/7.PR/15/2016/2). It was also partially supported by the Institute of Informatics research grant no. BKM/507/RAU2/2016.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Krasnokucki, D., Kwiatkowski, G., Jastrząb, T. (2017). A New Method of XML-Based Wordnets’ Data Integration. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation. BDAS 2017. Communications in Computer and Information Science, vol 716. Springer, Cham. https://doi.org/10.1007/978-3-319-58274-0_25
DOI: https://doi.org/10.1007/978-3-319-58274-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58273-3
Online ISBN: 978-3-319-58274-0
eBook Packages: Computer Science, Computer Science (R0)