Towards portable natural language interfaces based on case-based reasoning

Moreo, A.; Castro, J. L.; Zurita, J. M.

doi:10.1007/s10844-017-0453-8

Published: 27 February 2017

Volume 49, pages 281–314, (2017)

Journal of Intelligent Information Systems

Abstract

Natural Language Interfaces (NLI) allow non-technical people to access information stored in Knowledge Bases while keeping them unaware of the particular structure of the model or the underlying formal query language. Early research in the field was devoted to improving the performance of a particular system for a given Knowledge Base. Since adapting the system to new domains usually entailed considerable effort, investigating how to bring Portability to NLI became a new challenge. In this article, we investigate how Case-Based Reasoning could serve to assist the expert in porting the system so as to improve its retrieval performance. Our method, HITS, is based on a novel grammar learning algorithm combined with language acquisition techniques that exploit structural analogies. The learner (system) is able to engage the teacher (expert) in clarification dialogues to validate conjectures (hypotheses and deductions) about the language. Our method presents the following advantages: (i) the customization is naturally defined in the case-based cycle, (ii) the types of questions the system can deal with are not delimited in advance, and (iii) the system ‘reasons’ about precedent cases to deal with unseen questions.

Notes

http://www.w3.org/standards/semanticweb/ontology
A question is considered to be tractable in Precise if none of its tokens is ambiguous in the Knowledge Base.
For simplicity, these examples are shown as a human-like dialogue. Formal representations are presented later.
Note that even if the answer is not present in the database, interpreting the question and informing the user accordingly is still valuable: it spares the user from reformulating the same query again and again in an unfruitful and frustrating way.
http://msdn.microsoft.com/en-us/library/aa198281(v=sql.80).aspx
http://start.csail.mit.edu/
Our method is not restricted to any particular DB or DBMS. In any case, it may be worth mentioning that it was validated on MySQL.
http://decsai.ugr.es/moreo/publico/NLIDB_dataSets/Datasets_NLIDB.html
ErelatedToV is the QM devoted to retrieving all elements related to another element specified by a value. For example, “show all rivers in Alaska”.
In the first case, the lexicon contains entries (Book.Author, ’author’) and (Book.Author, ’write’). In the second case, the Lexicon contains entries (Author, ’author’), (Author, ’who’), and (Wrote, ’write’). (Recall the stemming process unifies ’write’, ’writes’, ’wrote’, etc.)
EmostRelated retrieves the element in an entity that is most related to another entity type. For example “which river traverses most states?”
Shallow parsing or shallow analysis usually stands for light parsers based on the identification of constituents, usually using named-entity techniques, and regardless of further syntactic or semantic considerations. They are considered to be less reliable but faster than other techniques such as formal grammar parsers.
In this case, p_i acts as a pivot. When more than one pivot is found, the one that produces the highest number of alignments is taken. For example, join(a b c a, b d a) pivots first on b instead of on a.
Even if the improvement is quite limited in this example, note that the coverage could improve significantly after several iterations.
We have limited the search to a maximum of two simultaneous hypotheses.
Deletions do not actually represent a problem, since interpretations regarding deleted data will operate properly, returning no data.
We used the exalead (http://www.exalead.com/search/) search engine to implement this method. In contrast to Turney’s notation (NEAR), this operator is called NEXT in exalead. For example: http://www.exalead.com/search/web/results/?q=Microsoft+NEXT+Company
Geobase presents a rich structure to test the addition of new Query Models and to interpret compound questions. Jobdata test questions contain several unseen terms to test the hypotheses. Finally, Restbase presents a relatively simple structure, but its questions show a considerable grammatical variability that helped us to test the production refinement method.
http://decsai.ugr.es/moreo/publico/NLIDB_dataSets/Datasets_NLIDB.html
AoperationE resolves queries requesting the elements of an entity that have some values satisfying a certain operation. For example, “what is the longest river”, where the operation Greatest is applied to the numerical attribute Length in entity River.
An Intel(R) Core(TM)2 Quad Q8200 2.33 GHz with 6 GB of RAM was used to carry out the tests.
http://msdn.microsoft.com/en-us/library/aa198281(v=sql.80).aspx
Results for Microsoft English Query were taken from the experimental validation reported in Popescu et al. (2003).
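
The query models named in the notes above (ErelatedToV, EmostRelated, AoperationE) can be pictured as parameterised query templates. The following is only a minimal sketch under stated assumptions, not the HITS implementation: the three-table schema, the toy rows, and all function names are hypothetical, and SQLite is swapped in for the MySQL setup mentioned in the notes so that the example is self-contained.

```python
import sqlite3

# Hypothetical toy schema; table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE state (name TEXT PRIMARY KEY);
CREATE TABLE river (name TEXT PRIMARY KEY, length INTEGER);
CREATE TABLE traverses (river TEXT, state TEXT);
"""


def build_db():
    """Create an in-memory database filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO state VALUES (?)",
                     [("Alaska",), ("Beta",), ("Gamma",)])
    conn.executemany("INSERT INTO river VALUES (?, ?)",
                     [("Alpha", 900), ("Bravo", 500), ("Charlie", 120)])
    conn.executemany("INSERT INTO traverses VALUES (?, ?)",
                     [("Alpha", "Alaska"), ("Bravo", "Alaska"),
                      ("Bravo", "Beta"), ("Bravo", "Gamma"),
                      ("Charlie", "Gamma")])
    return conn


def e_related_to_v(conn, value):
    """ErelatedToV-style template: rivers related to the state named `value`."""
    sql = "SELECT river FROM traverses WHERE state = ? ORDER BY river"
    return [row[0] for row in conn.execute(sql, (value,))]


def e_most_related(conn):
    """EmostRelated-style template: the river related to the most states."""
    sql = ("SELECT river FROM traverses GROUP BY river "
           "ORDER BY COUNT(DISTINCT state) DESC, river LIMIT 1")
    return conn.execute(sql).fetchone()[0]


def a_operation_e(conn):
    """AoperationE-style template: operation Greatest on attribute length."""
    sql = ("SELECT name FROM river "
           "WHERE length = (SELECT MAX(length) FROM river) ORDER BY name")
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    db = build_db()
    print(e_related_to_v(db, "Alaska"))  # "show all rivers in Alaska"
    print(e_most_related(db))            # "which river traverses most states?"
    print(a_operation_e(db))             # "what is the longest river"
```

In this reading, interpreting a question amounts to choosing a template and binding its slots (here, the value "Alaska") to elements of the Knowledge Base; how HITS actually represents and selects query models is described in the full article, not here.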

References

Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI communications, 7(1), 39–59.
Acorn, T.L., & Walden, S.H. (1992). Smart: Support management automated reasoning technology for compaq customer service. In Proceedings of the fourth conference on Innovative applications of artificial intelligence (pp. 3–18): AAAI Press.
Adriaans, P., & Vervoort, M. (2002). The emile 4.1 grammar induction toolbox. In Grammatical Inference: Algorithms and Applications, volume 2484 of Lecture Notes in Artificial Intelligence (pp. 293–295).
Aha, D.W., Breslow, L.A., & Muñoz-Avila, H. (2001). Conversational case-based reasoning. Applied Intelligence, 14(1), 9–32.
Aha, D.W., McSherry, D., & Yang, Q. (2005). Advances in conversational case-based reasoning. The knowledge engineering review, 20(03), 247–254.
Alshawi, H., Carter, D., Crouch, R., & Pulman, S. (1994). Clare: A contextual reasoning and cooperative response framework for the core language engine. Technical report CRC-028.
Androutsopoulos, I., Ritchie, G., & Thanish, P. (1993). Masque/sql, an efficient and portable natural language query interface for relational databases. In Proceedings 6th International Conference on Industrial & Engineering Applications of Artificial Intelligence and Expert Systems, pages 327–330, Edinburgh, UK.
Androutsopoulos, I., Ritchie, G.D., & Thanish, P. (1995). Natural language interfaces to databases - an introduction. Natural Language Engineering, 1(1), 29–81.
Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75(2), 87–106.
Bernstein, A., Kaufmann, E., & Kaiser, C. (2005). Querying the semantic web with Ginseng: A guided input natural language search engine. In 15th Workshop on Information Technologies and Systems, Las Vegas, NV (pp. 112–126).
Carrick, C., Yang, Q., Abi-Zeid, I., & Lamontagne, L. (1999). Activating cbr systems through autonomous information gathering. In International Conference on Case-Based Reasoning (pp. 74–88): Springer.
Chu, W., Yang, H., Chiang, K., Minock, M., Chow, G., & Larson, C. (1996). Cobase: A scalable and extensible cooperative information system. Journal of Intelligent Information Systems, 6, 223–259.
Cimiano, P., Haase, P., & Heizmann, J. (2007). Porting natural language interfaces between domains: an experimental user study with the orakel system. In Proceedings of the 12th international conference on Intelligent user interfaces, IUI ’07, pages 180–189, New York, NY, USA: ACM.
Cordier, A., Fuchs, B., Lieber, J., & Mille, A. (2007). Interactive knowledge acquisition in case based reasoning. In Wilson, D.C., & Khemani, D. (Eds.) Workshop on Knowledge Discovery and Similarity, a workshop of the seventh International Conference on Case-Based Reasoning (ICCBR-07): (volume editors).
Cullingford, E.R. (1978). Script application: Computer understanding of newspaper stories. Technical report, DTIC Document.
Cullot, N., Ghawi, R., & Kokou, Y. (2007). DB2OWL : A Tool For Automatic Database-to-Ontology Mapping. In SEBD (pp. 491–494).
Damljanović, D., & Bontcheva, K. (2009). Towards enhanced usability of natural language interfaces to knowledge bases. In Web 2.0 and Semantic Web, volume 6 of Annals of Information Systems (pp. 105–133). US: Springer.
Damljanovic, D., Agatonovic, M., & Cunningham, H. (2012). Freya: An interactive way of querying linked data using natural language. In Proceedings of the 8th international conference on The Semantic Web, ESWC’11, (pp. 125–138). Berlin, Heidelberg: Springer-Verlag.
Ferrucci, D., Levas, A., Bagchi, S., Gondek, D., & Mueller, E.T. (2013). Watson: beyond jeopardy!. Artificial Intelligence, 199, 93–105.
Frank, A., Krieger, H.-U., Feiyu, X., Uszkoreit, H., Crysmann, B., Jörg, B., & Schäfer, U. (2007). Question answering from structured knowledge sources. Journal of Applied Logic, 5(1), 20–48.
Gold, M.E. (1967). Language identification in the limit. Information and Control, 10(5), 447–474.
Grosz, B.J., Appelt, D.E., Martin, P.A., & Pereira, F.C.N. (1987). Team: an experiment in the design of transportable natural-language interfaces. Artificial Intelligence, 32(2), 173–243.
Hallett, C., Power, R., & Scott, D. (2007). Composing questions through conceptual authoring. Computational Linguistics, 33, 105–133.
Harris, Z.S. (1951). Structural linguistics. University of Chicago Press, Chicago, IL, USA and London, UK, 7th (1966) edition.
Hendrix, G., Sacerdoti, E., Sagalowicz, D., & Slocum, J. (1978). Developing a natural language interface to complex data. ACM Transactions on Database Systems, 3(2), 105–147.
Jones, K.S. (2005). Some points in a time. Computational Linguistics, 31(1), 1–14.
Kaplan, S.J. (1984). Designing a portable natural language database query system. ACM Transactions on Database Systems, 9, 1–19.
Kate, R.J., & Mooney, R.J. (2006). Using string-kernels for learning semantic parsers. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics.
Kate, R.J., Wong, Y.W., & Mooney, R.J. (2005). Learning to transform natural to formal languages. In Proceedings of the National Conference on Artificial Intelligence.
Kaufmann, E., & Bernstein, A. (2007). How useful are natural language interfaces to the semantic web for casual end-users?. In Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference, ISWC’07/ASWC’07 (pp. 281–294). Berlin, Heidelberg: Springer-Verlag.
Kaufmann, E., Bernstein, A., & Zumstein, R. (2006). Querix: A natural language interface to query ontologies based on clarification dialogs. In 5th ISWC (pp. 980–981): Springer.
Kaufmann, E., Bernstein, A., & Fischer, L. (2007). NLP-reduce: A naive but domain-independent natural language interface for querying ontologies.
Kittredge, R. (1982). Variation and homogeneity of sublanguages. Sublanguage: Studies of Language in Restricted Semantic Domains, pages 107–137.
Kolodner, J.L. (1983). Reconstructive memory: A computer model. Cognitive science, 7(4), 281–328.
Kolodner, J. (2014). Case-based reasoning. Morgan Kaufmann.
Leake, D.B., Kinley, A., & Wilson, D. (1996). Acquiring case adaptation knowledge: A hybrid approach. In Proceedings of the 13th National Conference on Artificial Intelligence, Menlo Park, CA (pp. 684–689): AAAI Press.
Lopez, V., Nikolov, A., Sabou, M., Uren, V., Motta, E., & D’Aquin, M. (2010). Scaling up question-answering to linked data. In Proceedings of the 17th international conference on Knowledge engineering and management by the masses, EKAW’10 (pp. 193–210). Berlin, Heidelberg: Springer-Verlag.
Lopez, V., Uren, V., Sabou, M., & Motta, E. (2011). Is question answering fit for the semantic web?: a survey. Semant. web, 2(2), 125–155.
Lu, W., Ng, H.T., Lee, W.S., & Zettlemoyer, L.S. (2008). A generative model for parsing natural language to meaning representations. In The Conference on Empirical Methods in Natural Language Processing.
McCord, M.C. (1990). Slot grammar: a system for simpler construction of practical natural language grammars. Technical report RC15582(D69261), IBM.
McCrae, J., & Spohr, D. (2011). Linking lexical resources and Ontologies on the semantic web with lemon. In Proceedings of the 8th extended semantic web conference on The semantic web: research and applications - Volume Part I, ESWC’11 (pp. 245–259). Berlin, Heidelberg: Springer-Verlag.
McSherry, D. (2014). An algorithm for conversational case-based reasoning in classification tasks. In International Conference on Case-Based Reasoning (pp. 289–304): Springer.
Miller, G., Beckwith, R., Fellbaum, C., Gross, D., & Miller, K. (1990). Introduction to wordnet: an on-line lexical database. International Journal of Lexicography (special issue), 3(4), 235–312.
Moreo, A., Navarro, M., Castro, J.L., & Zurita, J.M. (2012). A high-performance faq retrieval method using minimal differentiator expressions. Knowledge-Based Systems, 36(0), 9–20.
Moreo, A., Eisman, E.M., Castro, J.L., & Zurita, J.M. (2013). Learning regular expressions to template-based faq retrieval systems. Knowledge-Based Systems, 53, 108–128.
Ogden, W., Mcdonald, J., Bernick, P., & Chadwick, R. (2006). Habitability in question-answering systems. In Advances in Open Domain Question Answering, volume 32 of Text, Speech and Language Technology (pp. 457–473). Netherlands: Springer.
Ott, N. (1992). Aspects of the automatic generation of sql statements in a natural language query interface. Information Systems, (2):147–159.
Owei, V. (2000). Natural language querying of databases: an information extraction approach in the conceptual query language. International Journal of Human - Computer Studies, 53, 439–492.
Pazos, R.A., Pérez, J., González, J.J., Gelbukh, A., Sidorov, G., & Rodríguez, M.J. (2005). A domain independent natural language interface to databases capable of processing complex queries. In MICAI 2005 (pp. 833–842).
Popescu, A.M., Etzioni, O., & Kautz, H. (2003). Towards a theory of natural language interfaces to databases. In 8th Intl. Conf. on Intelligent User Interfaces, pages 149–157, Miami, FL.
Rodolfo, A., Pazos, R., Juan, J., González, B., Marco, A., Aguirre, L., José, A., Martínez, F., Héctor, J., & Fraire, H. (2013). Natural language interfaces to databases: An analysis of the state of the art. In Recent Advances on Hybrid Intelligent Systems, volume 451 of Studies in Computational Intelligence (pp. 463–480). Berlin, Heidelberg: Springer.
Sakakibara, Y. (1990). Learning context-free grammars from structural data in polynomial-time. Theoretical Computer Science, 76(2-3), 223–242.
Schank, R.C. (1983). Dynamic memory: A theory of reminding and learning in computers and people. Cambridge University Press.
Schank, R.C., & Abelson, R. (1977). Scripts, plans, goals and understanding: an inquiry into human knowledge structures.
Simazu, H., Shibata, A., & Nihei, K. (2001). Expertguide: A conversational case-based reasoning tool for developing mentors in knowledge spaces. Applied Intelligence, 14(1), 33–48.
Spoerri, A. (1993). Infocrystal: A visual tool for information retrieval & management. In Proceedings of the second international conference on Information and knowledge management, CIKM ’93, pages 11–20, New York, NY, USA: ACM.
Tang, L., & Mooney, R.J. (2001). Using multiple clause constructors in inductive logic programming for semantic parsing. In Proceedings of the 12th European Conference on Machine Learning (ECML-2001), pages 466–477, Freiburg, Germany.
Turney, P.D. (2002). Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, Pennsylvania.
Unger, C., Hieber, F., & Cimiano, P. (2010). Generating ltag grammars from a lexicon-ontology interface. In Proceedings of the 10th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+10), pages 61–68, Yale University 06/2010.
Waltz, D.L. (1978). An english language question answering system for a large relational database. Communications of the ACM, 21(7), 526–539.
Wang, C., Xiong, M., Zhou, Q., & Yu, Y. (2007). Panto – a portable natural language interface to ontologies. In 4th ESWC, Innsbruck (pp. 473–487): Springer-Verlag.
Weber, R.O., Ashley, K.D., & Brüninghaus, S. (2005). Textual case-based reasoning. The Knowledge Engineering Review, 20(03), 255–260.
Wong, Y.W., & Mooney, R.J. (2006). Learning for semantic parsing with statistical machine translation. In Proceedings of Human Language Technology Conference / North American Chapter of the Association for Computational Linguistics Annual Meeting (HLT-NAACL-06) (pp. 439–446). New York City, NY.
Woods, W.A., Kaplan, R.M., & Webber, B.N. (1972). The lunar sciences natural language information system: Final report. In BBN Report 2378. Cambridge, Massachusetts: Bolt Beranek and Newman Inc.
Zaanen, M.V. (2001). Bootstrapping Structure into Language: Alignment-based Learning. PhD thesis, School of Computing, University of Leeds, U.K.
Zhang, D.-M., Sheng, H.-Y., Li, F., & Yao, T.-F. (2002). The model design of a case-based reasoning multilingual natural language interface for database. In Proceedings International Conference on Machine Learning and Cybernetics, 2002, (Vol. 3 pp. 1474–1478): IEEE.
Zhou, L. (2007). Natural language interface for information management on mobile devices. Behaviour & Information Technology, 26(3), 197–207.

Acknowledgements

The authors would like to thank the Spanish ‘Ministerio de Educación y Ciencia’ and ‘Junta de Andalucía’ for supporting this research through project P09TIC5011. We would also like to thank the anonymous reviewers for their helpful comments.

Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, CITIC, University of Granada, Granada, Spain
A. Moreo, J. L. Castro & J. M. Zurita

Corresponding author

Correspondence to A. Moreo.

Appendix A: Entity-relationship diagrams of dataset domains

Figures 10, 11 and 12 show the Entity-Relationship diagrams for the Restbase, Geobase, and Jobdata domains, respectively.

About this article

Cite this article

Moreo, A., Castro, J.L. & Zurita, J.M. Towards portable natural language interfaces based on case-based reasoning. J Intell Inf Syst 49, 281–314 (2017). https://doi.org/10.1007/s10844-017-0453-8

Received: 27 November 2014
Revised: 07 January 2017
Accepted: 16 February 2017
Published: 27 February 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s10844-017-0453-8
