Abstract
This paper presents an approach to extracting knowledge from large text corpora, in particular knowledge that facilitates object manipulation by embodied intelligent systems that need to act in the world. As a first step, our goal is to extract the prototypical location of given objects from text corpora. We approach this task by calculating relatedness scores for objects and locations using techniques from distributional semantics. We empirically compare different methods for representing locations and objects as vectors in a geometric space, and we evaluate them against a crowd-sourced gold standard in which human subjects rated the prototypicality of a location given an object. By applying the proposed framework to DBpedia, we are able to build a knowledge base of 931 high-confidence object-location relations in a fully automatic fashion. (The work in this paper is partially funded by the ALOOF project (CHIST-ERA program).)
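The idea of scoring object-location pairs by relatedness can be illustrated with a minimal sketch. It assumes cosine similarity as the relatedness measure, which the abstract does not specify, and it uses invented three-dimensional toy vectors rather than vectors learned from a corpus. The entity labels (`Knife`, `Kitchen`, and so on) and the helper names `cosine` and `rank_locations` are hypothetical and chosen only for illustration.

```python
import math

# Toy vectors invented for illustration only. In practice the vectors
# would come from a distributional model trained on a large corpus.
VECTORS = {
    "Knife": [0.9, 0.1, 0.2],
    "Pillow": [0.1, 0.9, 0.1],
    "Kitchen": [0.8, 0.2, 0.3],
    "Bedroom": [0.2, 0.8, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_locations(obj, locations, vectors=VECTORS):
    """Return (location, score) pairs sorted by decreasing relatedness."""
    scored = [(loc, cosine(vectors[obj], vectors[loc])) for loc in locations]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for obj in ("Knife", "Pillow"):
        print(obj, rank_locations(obj, ["Kitchen", "Bedroom"]))
```

With these toy vectors, the top-ranked location for each object is the one whose vector points in the most similar direction. A high-confidence subset of such pairs could then be selected by thresholding the score, although the selection criterion actually used is not stated in the abstract.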
Notes
- 1. In the rest of the paper, the labels of the entities are identifiers from DBpedia URIs, stripped of the namespace http://dbpedia.org/resource/ for readability.
- 4. http://www.opencyc.org/; as RDF representations: http://sw.opencyc.org/.
- 5. Simple Knowledge Organization System: https://www.w3.org/2004/02/skos/.
- 7. The full automatically created knowledge base is available at http://project.inria.fr/aloof/files/2016/04/objectlocations.nt_.gz.
- 8. All the datasets resulting from this work are available at https://project.inria.fr/aloof/data/.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Basile, V., Jebbara, S., Cabrio, E., Cimiano, P. (2016). Populating a Knowledge Base with Object-Location Relations Using Distributional Semantics. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_3
DOI: https://doi.org/10.1007/978-3-319-49004-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49003-8
Online ISBN: 978-3-319-49004-5
eBook Packages: Computer Science, Computer Science (R0)