Multilevel Entity-Informed Business Relation Extraction

Khaldi, Hadjer; Benamara, Farah; Abdaoui, Amine; Aussenac-Gilles, Nathalie; Kang, EunBee

doi:10.1007/978-3-030-80599-9_10

Hadjer Khaldi (ORCID: orcid.org/0000-0001-9502-9145)¹²,¹³,
Farah Benamara¹³,
Amine Abdaoui¹²,
Nathalie Aussenac-Gilles¹³ &
EunBee Kang¹²

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12801)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

This paper describes a business relation extraction system that combines contextualized language models with multiple levels of entity knowledge. Our contributions are threefold: (1) a novel characterization of business relations, (2) the first large English dataset of more than 10k relation instances manually annotated according to this characterization, and (3) multiple neural architectures based on BERT, newly augmented with three complementary levels of knowledge about entities: generalization over entity type, pre-trained entity embeddings learned from two external knowledge graphs, and an entity-knowledge-aware attention mechanism. Our results show an improvement over many strong knowledge-agnostic and knowledge-enhanced state-of-the-art models for relation extraction.
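As an illustration of the first knowledge level named in the abstract (generalization over entity type), the following is a minimal sketch that replaces each entity mention with a placeholder built from its type before the sentence is passed to a language model. The span format, the `[ORG]` placeholder syntax, the function name `generalize_entities`, and the example sentence are all assumptions made for illustration; the abstract does not specify how the generalization is encoded.

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset, entity type).
Span = Tuple[int, int, str]


def generalize_entities(sentence: str, spans: List[Span]) -> str:
    """Replace each entity mention with a placeholder built from its type."""
    out = sentence
    # Process spans from right to left so earlier offsets stay valid.
    for start, end, etype in sorted(spans, key=lambda s: s[0], reverse=True):
        out = out[:start] + f"[{etype}]" + out[end:]
    return out


# Invented example sentence with two organization mentions.
sentence = "Acme Corp acquired Globex in 2019."
spans = [(0, 9, "ORG"), (19, 25, "ORG")]
generalized = generalize_entities(sentence, spans)
print(generalized)
```

A classifier trained on such input sees the entity type rather than the surface name, which is one plausible way to keep it from memorizing specific company names.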

Notes

1. https://github.com/Geotrend-research/business-relation-dataset.
2. We consider textual contents from various sources and formats, excluding those retrieved from social media, e-commerce, and code versioning websites.
3. The set of keywords has been chosen by business intelligence experts.
4. https://isahit.com/en/.
5. https://github.com/facebookresearch/BLINK.
6. https://huggingface.co/bert-base-cased.
7. All the hyperparameters were tuned on a validation set (10% of the train set).
8. Among existing entity-informed models (cf. Sect. 2), at the time of performing these experiments, and as far as we know, only KnowBert and ERNIE were actually available to the research community. In this paper, we compare with KnowBert as it achieved the best results on the TACRED dataset (71.50% F1-score) when compared to ERNIE (67.97%) [25].
9. We also experimented with Entity-Attention-BiLSTM following [10], but the results were not conclusive.
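The validation setup in note 7 (10% of the train set held out for hyperparameter tuning) can be sketched as follows. This is only an illustration under stated assumptions: the notes do not say whether the split was random or stratified, so a seeded random split is assumed here, and the function name `split_train_validation` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(
    examples: Sequence[T],
    validation_fraction: float = 0.1,  # 10% of the train set, as in note 7
    seed: int = 0,  # assumed: a fixed seed for reproducibility
) -> Tuple[List[T], List[T]]:
    """Hold out a random fraction of the training examples for validation."""
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)
    n_val = int(round(len(examples) * validation_fraction))
    val_indices = set(indices[:n_val])
    # Keep the original order within each part.
    train = [ex for i, ex in enumerate(examples) if i not in val_indices]
    val = [ex for i, ex in enumerate(examples) if i in val_indices]
    return train, val


train_part, val_part = split_train_validation(list(range(100)))
print(len(train_part), len(val_part))
```

For an imbalanced relation dataset, a split stratified by relation label may be preferable; that variant is not shown here.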

References

Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52
Braun, D., Faber, A., Hernandez-Mendez, A., Matthes, F.: Automatic relation extraction for building smart city ecosystems using dependency parsing. In: Proceedings of NL4AI@AI*IA, pp. 29–39. CEUR-WS.org (2018)
Camacho-Collados, J., Pilehvar, M.T., Navigli, R.: Nasari: integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities. Artif. Intell. 240, 36–64 (2016)
Collovini, S., Gonçalves, P.N., Cavalheiro, G., Santos, J., Vieira, R.: Relation extraction for competitive intelligence. In: Quaresma, P., Vieira, R., Aluísio, S., Moniz, H., Batista, F., Gonçalves, T. (eds.) PROPOR 2020. LNCS (LNAI), vol. 12037, pp. 249–258. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-41505-1_24
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, no. 1 (2019)
Gupta, P., Rajaram, S., Schütze, H., Runkler, T.: Neural relation extraction within and across sentence boundaries. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 6513–6520 (2019)
Hendrickx, I., et al.: SemEval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 33–38. ACL (2010)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lau, R., Zhang, W.: Semi-supervised statistical inference for business entities extraction and business relations discovery. In: Proceedings of SIGIR Workshop, pp. 41–46 (2011)
Lee, J., Seo, S., Choi, Y.S.: Semantic relation classification via bidirectional LSTM networks with entity-aware attention using latent entity typing. Symmetry 11(6), 785 (2019)
Li, B.Z., Min, S., Iyer, S., Mehdad, Y., Yih, W.T.: Efficient one-pass end-to-end entity linking for questions. In: Proceedings of EMNLP, pp. 6433–6441 (2020)
Li, J., Huang, G., Chen, J., Wang, Y.: Dual CNN for relation extraction with knowledge-based attention and word embeddings. Comput. Intell. Neurosci. 2019, 1–10 (2019)
Li, Z., Lian, Y., Ma, X., Zhang, X., Li, C.: Bio-semantic relation extraction with attention-based external knowledge reinforcement. BMC Bioinform 21, 1–18 (2020)
Martinez-Rodriguez, J.L., Hogan, A., Lopez-Arevalo, I.: Information extraction meets the semantic web: a survey. In: Semantic Web, pp. 1–81 (2020)
Mikolov, T., Grave, É., Bojanowski, P., Puhrsch, C., Joulin, A.: Advances in pre-training distributed word representations. In: Proceedings of LREC (2018)
Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D., Miller, K.J.: Introduction to wordnet: an on-line lexical database. Int. J. Lexicography 3(4), 235–244 (1990)
Mitchell, A., Strassel, S., Huang, S., Zakhary, R.: ACE 2004 Multilingual Training Corpus, p. 1. Linguistic Data Consortium, Philadelphia (2005)
Navigli, R., Ponzetto, S.P.: Babelnet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193, 217–250 (2012)
Pawar, S., Palshikar, G.K., Bhattacharyya, P.: Relation extraction: a survey. arXiv preprint arXiv:1712.05191 (2017)
Peters, M.E., et al.: Knowledge enhanced contextual word representations. In: Proceedings of EMNLP-IJCNLP, pp. 43–54 (2019)
Poerner, N., Waltinger, U., Schütze, H.: E-BERT: efficient-yet-effective entity embeddings for BERT. In: EMNLP, pp. 803–818. ACL (2020)
Sewlal, R.: Effectiveness of the web as a competitive intelligence tool. South African J. Inf. Manage. 6(1), 1–16 (2004)
Shi, P., Lin, J.: Simple bert models for relation extraction and semantic role labeling. arXiv preprint arXiv:1904.05255 (2019)
Soares, L.B., FitzGerald, N., Ling, J., Kwiatkowski, T.: Matching the blanks: distributional similarity for relation learning. In: Proceedings of ACL, pp. 2895–2905 (2019)
Wang, R., et al.: K-adapter: Infusing knowledge into pre-trained models with adapters. arXiv preprint arXiv:2002.01808 (2020)
Wei, Q., et al.: Relation extraction from clinical narratives using pre-trained language models. In: AMIA Annual Symposium Proceedings, vol. 2019, p. 1236. American Medical Informatics Association (2019)
Wiegand, M., Roth, B., Lasarcyk, E., Köser, S., Klakow, D.: A gold standard for relation extraction in the food domain. In: Proceedings of LREC (2012)
Wu, S., He, Y.: Enriching pre-trained language model with entity information for relation classification. In: Proceedings of ACM CIKM 2019, pp. 2361–2364 (2019)
Yadav, S., Ramesh, S., Saha, S., Ekbal, A.: Relation extraction from biomedical and clinical text: unified multitask learning framework. IEEE/ACM Trans. Comput. Biol. Bioinform. (2020)
Yamada, I., et al.: Wikipedia2Vec: an efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia. In: Proceedings of EMNLP: System Demonstrations, pp. 23–30 (2020)
Yamada, I., Shindo, H., Takeda, H., Takefuji, Y.: Joint learning of the embedding of words and entities for named entity disambiguation. In: Proceedings of The 20th SIGNLL CoNLL, pp. 250–259 (2016)
Yamamoto, A., Miyamura, Y., Nakata, K., Okamoto, M.: Company relation extraction from web news articles for analyzing industry structure. In: 2017 IEEE ICSC, pp. 89–92 (2017)
Yan, C., Fu, X., Wu, W., Lu, S., Wu, J.: Neural network based relation extraction of enterprises in credit risk management. In: 2019 IEEE BigComp, pp. 1–6 (2019)
Ye, W., Li, B., Xie, R., Sheng, Z., Chen, L., Zhang, S.: Exploiting entity BIO tag embeddings and multi-task learning for relation extraction with imbalanced data. In: Proceedings of ACL, pp. 1351–1360 (2019)
Zeng, D., Liu, K., Lai, S., Zhou, G., Zhao, J.: Relation classification via convolutional deep neural network. In: Proceedings of COLING: Technical Papers, pp. 2335–2344. ACL, Dublin City University (2014)
Zhang, Y., Zhong, V., Chen, D., Angeli, G., Manning, C.D.: Position-aware attention and supervised data improve slot filling. In: Proceedings of EMNLP, pp. 35–45. ACL (2017)
Zhao, J., Jin, P., Liu, Y.: Business relations in the web: semantics and a case study. J. Softw. 5(8), 826–833 (2010)
Zhou, P., et al.: Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of ACL (Volume 2: Short Papers), pp. 207–212. ACL (2016)
Zuo, Z., Loster, M., Krestel, R., Naumann, F.: Uncovering business relationships: Context-sensitive relationship extraction for difficult relationship types. In: Proceedings of LWDA (2017)

Author information

Authors and Affiliations

Geotrend, Toulouse, France
Hadjer Khaldi, Amine Abdaoui & EunBee Kang
IRIT-CNRS, Toulouse, France
Hadjer Khaldi, Farah Benamara & Nathalie Aussenac-Gilles

Corresponding author

Correspondence to Hadjer Khaldi.

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Derby, Derby, UK
Farid Meziane
German Research Center for Artificial Intelligence, Saarbrücken, Germany
Helmut Horacek
University of Hertfordshire, Hatfield, UK
Epaminondas Kapetanios

About this paper

Cite this paper

Khaldi, H., Benamara, F., Abdaoui, A., Aussenac-Gilles, N., Kang, E. (2021). Multilevel Entity-Informed Business Relation Extraction. In: Métais, E., Meziane, F., Horacek, H., Kapetanios, E. (eds) Natural Language Processing and Information Systems. NLDB 2021. Lecture Notes in Computer Science, vol 12801. Springer, Cham. https://doi.org/10.1007/978-3-030-80599-9_10

DOI: https://doi.org/10.1007/978-3-030-80599-9_10
Published: 20 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80598-2
Online ISBN: 978-3-030-80599-9
eBook Packages: Computer Science, Computer Science (R0)