QaldGen: Towards Microbenchmarking of Question Answering Systems over Knowledge Graphs

  • Kuldeep Singh (email author)
  • Muhammad Saleem
  • Abhishek Nadgeri
  • Felix Conrads
  • Jeff Z. Pan
  • Axel-Cyrille Ngonga Ngomo
  • Jens Lehmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11779)


Over the last years, a number of Knowledge Graph (KG) based Question Answering (QA) systems have been developed. Consequently, the series of Question Answering over Linked Data (QALD1–QALD9) challenges and other datasets have been proposed to evaluate these systems. However, these QA datasets contain a fixed number of natural language questions and do not allow users to select microbenchmarking samples of questions tailored to specific use cases. We propose QaldGen, a framework for microbenchmarking of QA systems over KGs that selects customised question samples from existing QA datasets. The framework is flexible enough to select question samples of varying sizes, according to user-defined criteria on the features considered most important for QA benchmarking. This is achieved using different clustering algorithms. We compare state-of-the-art QA systems over knowledge graphs using different QA benchmarking samples. The observed results show that specialised microbenchmarking is important to pinpoint the limitations of the various QA systems and their components.
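The idea of selecting a customised sample through clustering can be illustrated with a minimal sketch. It assumes that each question has already been turned into a numeric feature vector on a comparable scale, and it uses plain k-means as a stand-in for the clustering step: one question per cluster, the one closest to the cluster centroid, is taken into the sample. The function name `select_sample`, the two feature columns, and the choice of k-means are assumptions made for illustration, not the actual features or algorithms of QaldGen.

```python
import math
import random


def select_sample(features, k, seed=0, iterations=20):
    """Pick up to k representative question indices, one per k-means cluster.

    `features` is a list of equal-length numeric vectors, one per question.
    Hypothetical helper for illustration; not part of QaldGen.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct, randomly chosen questions.
    centroids = [list(features[i]) for i in rng.sample(range(len(features)), k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every question to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, vec in enumerate(features):
            nearest = min(range(k), key=lambda c: math.dist(vec, centroids[c]))
            clusters[nearest].append(idx)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                columns = zip(*(features[i] for i in members))
                centroids[c] = [sum(col) / len(members) for col in columns]
    # From each non-empty cluster, keep the question closest to the centroid.
    picked = []
    for c, members in enumerate(clusters):
        if members:
            picked.append(
                min(members, key=lambda i: math.dist(features[i], centroids[c]))
            )
    return sorted(picked)


# Assumed toy features per question: (number of triple patterns, question length).
questions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample = select_sample(questions, k=2)
print(sample)
```

Taking the member nearest to each centroid yields a sample whose size is controlled by `k` and which covers the distinct regions of the feature space; empty clusters are skipped, so fewer than `k` questions may be returned.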

Resource Type: Evaluation benchmarks or Methods


License: GNU General Public License v3.0



This work has been supported by the projects LIMBO (Grant no. 19F2029I), OPAL (no. 19F2028A), KnowGraphs (no. 860801), and SOLIDE (no. 13N14456).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Kuldeep Singh (1), email author
  • Muhammad Saleem (2)
  • Abhishek Nadgeri (3)
  • Felix Conrads (2)
  • Jeff Z. Pan (5, 6)
  • Axel-Cyrille Ngonga Ngomo (4)
  • Jens Lehmann (7)

  1. Nuance Communications Deutschland GmbH, Aachen, Germany
  2. University of Leipzig, Leipzig, Germany
  3. Service Lee Technologies, Mumbai, India
  4. University of Paderborn, Paderborn, Germany
  5. Edinburgh Research Centre, Huawei, Edinburgh, UK
  6. University of Aberdeen, Aberdeen, UK
  7. Fraunhofer IAIS, Sankt Augustin, Germany
