QaldGen: Towards Microbenchmarking of Question Answering Systems over Knowledge Graphs

Singh, Kuldeep; Saleem, Muhammad; Nadgeri, Abhishek; Conrads, Felix; Pan, Jeff Z.; Ngomo, Axel-Cyrille Ngonga; Lehmann, Jens

doi:10.1007/978-3-030-30796-7_18

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11779)

Included in the following conference series:

International Semantic Web Conference

Abstract

Over the last few years, a number of Knowledge Graph (KG) based Question Answering (QA) systems have been developed. Consequently, the series of Question Answering over Linked Data (QALD1–QALD9) challenges and other datasets have been proposed to evaluate these systems. However, these QA datasets contain a fixed number of natural language questions and do not allow users to select microbenchmarking samples of questions tailored to specific use cases. We propose QaldGen, a framework for microbenchmarking of QA systems over KGs that can select customised question samples from existing QA datasets. The framework is flexible enough to select question samples of varying sizes according to user-defined criteria on the features considered most important for QA benchmarking. This is achieved using different clustering algorithms. We compare state-of-the-art QA systems over knowledge graphs using different QA benchmarking samples. The observed results show that specialised microbenchmarking is important for pinpointing the limitations of the various QA systems and their components.
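
The sampling idea in the abstract can be illustrated with a minimal sketch. This is not the QaldGen implementation: it clusters questions on a few numeric features with a plain k-means and keeps the question nearest each cluster centre, so a sample of size k covers k different regions of the feature space. The feature names (number of triple patterns, question length in words, number of answers), the toy values, and all function names are assumptions made for illustration; the abstract does not state which features or clustering algorithms the framework uses.

```python
import math

# Hypothetical questions with made-up feature values, for illustration only:
# (id, number of triple patterns, question length in words, number of answers)
QUESTIONS = [
    ("q1", 1, 5, 1),
    ("q2", 1, 6, 2),
    ("q3", 3, 12, 10),
    ("q4", 3, 11, 12),
    ("q5", 2, 20, 1),
    ("q6", 2, 19, 2),
]


def normalise(rows):
    """Min-max scale each feature column to [0, 1] so no feature dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1 for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(r, lows, spans)) for r in rows]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep it if the group is empty.
        centres = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
    return centres


def select_sample(questions, k):
    """Return k question ids, one per cluster (the question nearest each centre)."""
    ids = [q[0] for q in questions]
    points = normalise([q[1:] for q in questions])
    centres = kmeans(points, k)
    chosen = []
    for c in centres:
        best = min(range(len(points)), key=lambda i: math.dist(points[i], c))
        chosen.append(ids[best])
    return chosen


sample = select_sample(QUESTIONS, k=3)
print(sample)
```

With this toy data the three selected questions come from three visibly different groups (short simple questions, multi-pattern questions with many answers, and long questions). Changing k changes the sample size, and changing the feature columns changes what the sample is tailored to.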

Resource Type: Evaluation benchmarks or Methods

Repository: https://github.com/dice-group/qald-generator

License: GNU General Public License v3.0

K. Singh and M. Saleem contributed equally as first authors.

Notes

1. http://qa.mpi-inf.mpg.de/comqa/.
2. Prefix dbr is bound to http://dbpedia.org/resource/.
3. http://gerbil-qa.aksw.org/gerbil/experiment?id=201903190000.
4. As reported by [27] and [7].
5. https://github.com/dice-group/NLIWOD.
6. http://frankenstein.qanary-qa.com.
7. https://github.com/dice-group/gerbil/wiki/Precision,-Recall-and-F1-measure.
8. https://github.com/dice-group/gerbil/wiki/QALD-Generation.
9. https://research.fb.com/downloads/babi/.
10. http://qald.aksw.org/index.php?x=home&q=1.

References

1. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52
2. Berant, J., Chou, A., Frostig, R., Liang, P.: Semantic parsing on freebase from question-answer pairs. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, pp. 1533–1544. ACL (2013)
3. Bollacker, K.D., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: ACM SIGMOD, pp. 1247–1250 (2008)
4. Choi, K.-S., Mitamura, T., Vossen, P., Kim, J.-D., Ngomo, A.-C.N.: SIGIR 2017 workshop on open knowledge base and question answering (OKBQA 2017). In: Proceedings of the ACM SIGIR, pp. 1433–1434 (2017)
5. Derczynski, L., et al.: Analysis of named entity recognition and linking for tweets. Inf. Process. Manag. 51(2), 32–49 (2015)
6. Diefenbach, D., Both, A., Singh, K., Maret, P.: Towards a question answering system over the semantic web. arXiv preprint arXiv:1803.00832 (2018)
7. Dubey, M., Banerjee, D., Chaudhuri, D., Lehmann, J.: EARL: joint entity and relation linking for question answering over knowledge graphs. In: Vrandečić, D., et al. (eds.) ISWC 2018. LNCS, vol. 11136, pp. 108–126. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00671-6_7
8. Ferragina, P., Scaiella, U.: TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities). In: Proceedings of the 19th ACM Conference on Information and Knowledge Management, CIKM 2010, Toronto, Ontario, Canada, 26–30 October 2010, pp. 1625–1628. ACM (2010)
9. Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., Ngomo, A.N.: Survey on challenges of question answering in the semantic web. Semant. Web 8(6), 895–920 (2017)
10. Huang, X., Zhang, J., Li, D., Li, P.: Knowledge graph embedding based question answering. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 105–113. ACM (2019)
11. Li, F., Jagadish, H.V.: Constructing an interactive natural language interface for relational databases. PVLDB 8(1), 73–84 (2014)
12. Loni, B.: A survey of state-of-the-art methods on question classification (2011)
13. Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A., Lehmann, J.: Learning to rank query graphs for complex question answering over knowledge graphs. arXiv preprint arXiv:1811.01118 (2018)
14. Mazzeo, G.M., Zaniolo, C.: Answering controlled natural language questions on RDF knowledge bases. In: EDBT, pp. 608–611 (2016)
15. Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: DBpedia spotlight: shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems, I-SEMANTICS 2011, Graz, Austria, 7–9 September 2011, pp. 1–8. ACM (2011)
16. Moldovan, D., Paşca, M., Harabagiu, S., Surdeanu, M.: Performance issues and error analysis in an open-domain question answering system. ACM Trans. Inf. Syst. (TOIS) 21(2), 133–154 (2003)
17. Ngomo, N.: 9th challenge on question answering over linked data (QALD-9). Language 7:1
18. Sakor, A., et al.: Old is gold: linguistic driven approach for entity and relation linking of short text. In: NAACL 2019. ACL (2019, to appear)
19. Saleem, M., Ali, M.I., Hogan, A., Mehmood, Q., Ngomo, A.-C.N.: LSQ: the linked SPARQL queries dataset. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9367, pp. 261–269. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25010-6_15
20. Saleem, M., Dastjerdi, S.N., Usbeck, R., Ngomo, A.N.: Question answering over linked data: what is difficult to answer? What affects the F scores? In: Joint Proceedings of BLINK 2017: Co-Located with (ISWC 2017), Austria (2017)
21. Saleem, M., Mehmood, Q., Ngonga Ngomo, A.-C.: FEASIBLE: a feature-based SPARQL benchmark generation framework. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9366, pp. 52–69. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25007-6_4
22. Saleem, M., Stadler, C., Mehmood, Q., Lehmann, J., Ngomo, A.-C.N.: SQCFramework: SPARQL query containment benchmark generation framework. In: Proceedings of the Knowledge Capture Conference, p. 28. ACM (2017)
23. Shekarpour, S., Marx, E., Ngomo, A.N., Auer, S.: SINA: semantic interpretation of user queries for question answering on interlinked data. J. Web Sem. 30, 39–51 (2015)
24. Singh, K.: Towards dynamic composition of question answering pipelines. Ph.D. thesis, University of Bonn, Germany (2019)
25. Singh, K., Both, A., Sethupat, A., Shekarpour, S.: Frankenstein: a platform enabling reuse of question answering components. In: Gangemi, A., et al. (eds.) ESWC 2018. LNCS, vol. 10843, pp. 624–638. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93417-4_40
26. Singh, K., Lytra, I., Radhakrishna, A.S., Shekarpour, S., Vidal, M.-E., Lehmann, J.: No one is perfect: Analysing the performance of question answering components over the DBpedia knowledge graph. arXiv:1809.10044 (2018)
27. Singh, K., et al.: Why reinvent the wheel: let’s build question answering systems together. In: Web Conference, pp. 1247–1256 (2018)
28. Trivedi, P., Maheshwari, G., Dubey, M., Lehmann, J.: LC-QuAD: a corpus for complex question answering over knowledge graphs. In: d’Amato, C., et al. (eds.) ISWC 2017. LNCS, vol. 10588, pp. 210–218. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68204-4_22
29. Unger, C., et al.: Question answering over linked data (QALD-5). In: Working Notes of CLEF 2015 - Conference and Labs of the Evaluation forum, Toulouse, France, 8–11 September 2015. CEUR-WS.org (2015)
30. Usbeck, R., Hoffmann, M., Röder, M., Lehmann, J., Ngomo, A.N.: Using multi-label classification for improved question answering. CoRR (2017)
31. Usbeck, R., et al.: Benchmarking question answering systems. Semant. Web J. (2019)
32. Usbeck, R., et al.: GERBIL: general entity annotator benchmarking framework. In: WWW 2015, pp. 1133–1143 (2015)
33. Voorhees, E.M., Harman, D.K. (eds.): Proceedings of The Eighth Text REtrieval Conference, TREC 1999, Gaithersburg, Maryland, USA, 17–19 November 1999, volume Special Publication, 500-246. National Institute of Standards and Technology (NIST) (1999)
34. Vrandecic, D.: Wikidata: a new platform for collaborative data collection. In: Proceedings of the 21st World Wide Web Conference, WWW 2012, Lyon, France, 16–20 April 2012 (Companion Volume), pp. 1063–1064. ACM (2012)
35. Waitelonis, J., Jürges, H., Sack, H.: Remixing entity linking evaluation datasets for focused benchmarking. Semant. Web 10(2), 385–412 (2019)
36. Zafar, H., Napolitano, G., Lehmann, J.: Formal query generation for question answering over knowledge bases. In: Gangemi, A., et al. (eds.) ESWC 2018. LNCS, vol. 10843, pp. 714–728. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93417-4_46

Acknowledgments

This work has been supported by the projects LIMBO (Grant no. 19F2029I), OPAL (no. 19F2028A), KnowGraphs (no. 860801), and SOLIDE (no. 13N14456).

Author information

Authors and Affiliations

Nuance Communications Deutschland GmbH, Aachen, Germany
Kuldeep Singh
University of Leipzig, Leipzig, Germany
Muhammad Saleem & Felix Conrads
Service Lee Technologies, Mumbai, India
Abhishek Nadgeri
University of Paderborn, Paderborn, Germany
Axel-Cyrille Ngonga Ngomo
Edinburgh Research Centre, Huawei, Edinburgh, UK
Jeff Z. Pan
University of Aberdeen, Aberdeen, UK
Jeff Z. Pan
Fraunhofer IAIS, Sankt Augustin, Germany
Jens Lehmann

Corresponding author

Correspondence to Kuldeep Singh.

Editor information

Editors and Affiliations

Fondazione Bruno Kessler, Trento, Italy
Chiara Ghidini
Linköping University, Linköping, Sweden
Olaf Hartig
University of Bonn, Bonn, Germany
Maria Maleshkova
University of Economics Prague, Prague, Czech Republic
Vojtěch Svátek
University of Illinois at Chicago, Chicago, IL, USA
Isabel Cruz
University of Chile, Santiago, Chile
Aidan Hogan
Memect Technology, Beijing, China
Jie Song
Mines Saint-Etienne, Saint-Etienne, France
Maxime Lefrançois
Inria Sophia Antipolis - Méditerranée, Sophia Antipolis, France
Fabien Gandon

Cite this paper

Singh, K. et al. (2019). QaldGen: Towards Microbenchmarking of Question Answering Systems over Knowledge Graphs. In: Ghidini, C., et al. The Semantic Web – ISWC 2019. ISWC 2019. Lecture Notes in Computer Science, vol 11779. Springer, Cham. https://doi.org/10.1007/978-3-030-30796-7_18

DOI: https://doi.org/10.1007/978-3-030-30796-7_18
Published: 17 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30795-0
Online ISBN: 978-3-030-30796-7
eBook Packages: Computer Science, Computer Science (R0)
