Beyond word embeddings: learning entity and concept representations from large-scale knowledge bases
Text representations based on neural word embeddings have proven effective in many NLP applications. Recent research adapts traditional word embedding models to learn vectors of multiword expressions (concepts/entities). However, these methods are limited to textual knowledge bases (e.g., Wikipedia). In this paper, we propose a simple and novel technique for learning concept representations by integrating knowledge about concepts from two large-scale knowledge bases of different structure (Wikipedia and Probase). We adapt the efficient skip-gram model to learn seamlessly from both Wikipedia text and the Probase concept graph. We evaluate our concept embedding models on two tasks: (1) analogical reasoning, where we achieve state-of-the-art performance of 91% on semantic analogies, and (2) concept categorization, where we achieve state-of-the-art performance on two benchmark datasets, with categorization accuracy of 100% on one and 98% on the other. Additionally, we present a case study evaluating our model on unsupervised argument type identification for neural semantic parsing. We demonstrate the competitive accuracy of our unsupervised method and its ability to generalize better to out-of-vocabulary entity mentions than the tedious and error-prone methods that depend on gazetteers and regular expressions.
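The analogical-reasoning task mentioned above is conventionally evaluated with the vector offset method: given "a is to b as c is to ?", the answer is the concept whose embedding is closest (by cosine similarity) to b − a + c. The sketch below illustrates this evaluation protocol only; the toy vectors are hypothetical stand-ins, not embeddings produced by the model described in the paper.

```python
import numpy as np

# Toy concept embeddings (hypothetical 3-d vectors; real concept vectors
# would be learned from Wikipedia text and the Probase concept graph).
emb = {
    "paris":  np.array([1.0, 0.0, 0.2]),
    "france": np.array([1.0, 1.0, 0.2]),
    "rome":   np.array([0.0, 0.0, 0.9]),
    "italy":  np.array([0.0, 1.0, 0.9]),
}

def analogy(a, b, c, emb):
    """Answer 'a is to b as c is to ?' via the vector offset method."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -np.inf
    for concept, vec in emb.items():
        if concept in (a, b, c):
            continue  # exclude query terms, as in the word2vec evaluation
        sim = vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = concept, sim
    return best

print(analogy("paris", "france", "rome", emb))  # -> italy
```

Accuracy on an analogy benchmark is then simply the fraction of questions for which the top-ranked concept matches the gold answer.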
Keywords: Entity and concept embeddings · Entity identification · Concept categorization · Skip-gram · Probase · Knowledge graph representations
This work was partially supported by the National Science Foundation under Grant Number 1624035. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors would like to thank Avik Ray and Yilin Shen from Samsung Research America for their constructive feedback and discussions while developing the case study on the argument type identification task. The authors also appreciate the reviewers' valuable and insightful comments.