Automatic Web-Based Question Answer Generation System for Online Feedable New-Born Chatbot

  • Sameera A. Abdul-Kader (Email author)
  • John Woods
  • Thabat Thabet
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 858)


The knowledge bases of Chatbots are built manually, which makes them difficult and time consuming to create and maintain. The idea of automatically building a Chatbot knowledge base from the web has emerged in recent years. Question Answer (QA) pairs are acquired from existing online forums. Little work has been done on generating questions from existing factual or fictional sentences. Two main contributions are presented in this paper. The first contribution is generating factual questions from sentences gathered by a web spider; the raw text sentences are extracted from the HTML and pre-processed. Named Entity (proper noun) Recognition (NER) is used in addition to verb tense recognition in order to identify the factual sentence category. Specific rules are built to categorize the sentences and then to generate questions based upon them. The second contribution is to generate a new-born Chatbot database by placing the resultant QA pairs into an SQLite database built for this purpose. The newly built database is used to nurture a Chatbot that can simulate the personality of a desired figure or the behavior of an object. The footballer David Beckham is used as an example, and the data used is acquired from a Wikipedia page about him. The resulting QA pairs are presented, and a subjective assessment shows considerable enhancement in QA pair generation over a comparative system.
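
The rule-based generation and SQLite storage steps described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the authors' implementation: the fixed name list standing in for NER, the three regular-expression rules, the `qa_pairs` table layout, the choice to store the source sentence as the answer, and the sample sentences (which are not taken from the paper's data).

```python
import re
import sqlite3

# Assumption: a fixed name list stands in for a real NER component.
PERSONS = ["David Beckham"]
NAMES = "|".join(re.escape(name) for name in PERSONS)

# Assumption: illustrative rules only. Each rule pairs a sentence pattern
# (person entity + past-tense verb phrase) with a question template.
RULES = [
    (re.compile(rf"^(?P<person>{NAMES}) was born in \d{{4}}\b"),
     "When was {person} born?"),
    (re.compile(rf"^(?P<person>{NAMES}) was born in [A-Z]"),
     "Where was {person} born?"),
    (re.compile(rf"^(?P<person>{NAMES}) played for [A-Z]"),
     "Which team did {person} play for?"),
]


def generate_qa(sentence):
    """Return (question, answer) for the first matching rule, else None.

    The source sentence itself is used as the answer (an assumption).
    """
    for pattern, template in RULES:
        match = pattern.match(sentence)
        if match:
            return template.format(person=match.group("person")), sentence
    return None


def store_pairs(conn, pairs):
    """Create the (hypothetical) qa_pairs table if needed and insert the pairs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS qa_pairs ("
        "id INTEGER PRIMARY KEY, question TEXT NOT NULL, answer TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO qa_pairs (question, answer) VALUES (?, ?)", pairs
    )
    conn.commit()


# Hand-written sample sentences; a real run would use text from the web spider.
sentences = [
    "David Beckham was born in 1975.",
    "David Beckham was born in London.",
    "David Beckham played for Manchester United.",
    "The weather was pleasant.",  # matches no rule, so it is skipped
]

pairs = [qa for qa in map(generate_qa, sentences) if qa is not None]
conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
store_pairs(conn, pairs)
rows = conn.execute(
    "SELECT question, answer FROM qa_pairs ORDER BY id"
).fetchall()
for question, answer in rows:
    print(question, "->", answer)
```

Ordering the rules from most to least specific lets the year pattern claim a sentence before the more general place pattern is tried; a sentence that matches no rule produces no QA pair.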


Chatbot knowledge, Feature extraction, Information retrieval, Named entity recognition, Natural language processing, Question answer pairs



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Sameera A. Abdul-Kader (1, 2), Email author
  • John Woods (1)
  • Thabat Thabet (1, 3)
  1. School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
  2. University of Diyala, Diyala, Iraq
  3. Technical College, Mosul, Iraq
