Abstract
Ontologies have been used as a form of knowledge representation in fields such as artificial intelligence, the Semantic Web and natural language processing. The recent success of deep learning, a major upheaval in the field of artificial intelligence, depends greatly on data representation, since learned representations can encode different types of hidden syntactic and semantic relationships in data, which makes them very common in data science tasks. Ontologies do not escape this trend: applying deep learning techniques in ontology engineering has heightened the need to learn and generate representations of ontological data, which allow ontologies to be exploited by such models and algorithms and thus help automate different ontology-engineering tasks. This paper presents a novel approach for learning low-dimensional continuous feature representations for ontology entities based on the semantics embedded in ontologies, using a multi-input feed-forward neural network trained with the noise contrastive estimation technique. Semantically similar ontology entities will have relatively close corresponding representations in the projection space. Thus, the relationships between the representations of ontology entities mirror exactly the semantic relations between the corresponding entities in the source ontology.
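The abstract names noise contrastive estimation (NCE) but does not spell out the objective. The following is a minimal sketch of the standard NCE loss for one observed (context, target) pair, assuming dot-product scoring and a known noise distribution; it is not the paper's network. The names `nce_loss`, `context_vec`, `target_vec`, `noise_vecs`, `q_target`, `q_noise` and `cosine` are hypothetical and chosen only for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_loss(context_vec, target_vec, noise_vecs, q_target, q_noise):
    """NCE loss for one observed pair and k noise targets.

    q_target and q_noise hold the probability of the observed target and
    of each noise target under the noise distribution (assumed known).
    """
    k = len(noise_vecs)
    # Observed pair: should be classified as "data".
    delta_pos = dot(context_vec, target_vec) - math.log(k * q_target)
    loss = -log_sigmoid(delta_pos)
    # Noise pairs: should be classified as "noise".
    for vec, q in zip(noise_vecs, q_noise):
        delta_neg = dot(context_vec, vec) - math.log(k * q)
        loss -= log_sigmoid(-delta_neg)
    return loss


def cosine(a, b):
    """Cosine similarity, one common way to measure closeness of embeddings."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy 2-dimensional vectors (made up for illustration, not learned).
context = [1.0, 0.0]
aligned = [1.0, 0.0]
opposed = [-1.0, 0.0]

# The loss is lower when the observed target agrees with the context.
loss_good = nce_loss(context, aligned, [opposed], 0.5, [0.5])
loss_bad = nce_loss(context, opposed, [aligned], 0.5, [0.5])
```

In this toy setting `loss_good` is smaller than `loss_bad`, so minimising the loss pulls the vectors of observed pairs together and pushes noise pairs apart. That is the mechanism by which related entities would end up close in the projection space.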
Acknowledgments
This paper is funded by the International Exchange Program of Harbin Engineering University for Innovation-oriented Talented Cultivation.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Benarab, A., Sun, J., Refoufi, A., Guan, J. (2019). Efficient Estimation of Ontology Entities Distributed Representations. In: Uden, L., Ting, IH., Corchado, J. (eds) Knowledge Management in Organizations. KMO 2019. Communications in Computer and Information Science, vol 1027. Springer, Cham. https://doi.org/10.1007/978-3-030-21451-7_5
DOI: https://doi.org/10.1007/978-3-030-21451-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21450-0
Online ISBN: 978-3-030-21451-7
eBook Packages: Computer Science, Computer Science (R0)