Syntactic Structural Kernels for Natural Language Interfaces to Databases

  • Alessandra Giordani
  • Alessandro Moschitti
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)


A core problem in data mining is retrieving data in an easy and human-friendly way. Automatically translating natural language questions into SQL queries would allow database systems to be designed so that they are effective and useful from the user's viewpoint. Previous work has focused on machine learning algorithms that automatically map natural language (NL) questions to SQL queries.

In this paper, we present several structural kernels and their combinations for inducing the relational semantics between pairs of NL questions and SQL queries. We measure the effectiveness of these kernels by using them in Support Vector Machines to select the queries that correctly answer NL questions. Experimental results on two different datasets show that our approach is viable and that syntactic information, in the form of pairs of syntactic tree fragments (from queries and questions), plays a major role in deriving the relational semantics between the two languages.
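As a rough illustration of what a structural kernel over question/query pairs can look like, the sketch below implements a subset-tree kernel in the style of Collins and Duffy and combines two such kernels by a product. It is a minimal sketch under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the function names (`tree_kernel`, `pair_kernel`), the decay parameter `lam`, and the choice of a product as the combination are all hypothetical choices made for this example.

```python
from itertools import product


def label(node):
    """Return the label of a node: the word itself for a leaf, else node[0]."""
    return node if isinstance(node, str) else node[0]


def production(node):
    """Grammar production at an internal node: its label plus, for each
    child, the child's label and whether that child is a leaf word."""
    return (node[0], tuple((label(c), isinstance(c, str)) for c in node[1:]))


def internal_nodes(tree):
    """Collect all internal (non-leaf) nodes of a nested-tuple tree."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes


def delta(n1, n2, lam):
    """Weighted count of common tree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        # Leaf children already matched via the production; recurse only
        # into internal children.
        if not isinstance(c1, str):
            score *= 1.0 + delta(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=1.0):
    """Subset-tree kernel: sum of delta over all pairs of internal nodes."""
    return sum(delta(a, b, lam)
               for a, b in product(internal_nodes(t1), internal_nodes(t2)))


def pair_kernel(p1, p2, lam=1.0):
    """Assumed combination: product of a kernel on the question trees and a
    kernel on the query trees of two (question, query) pairs."""
    (q1, s1), (q2, s2) = p1, p2
    return tree_kernel(q1, q2, lam) * tree_kernel(s1, s2, lam)
```

Here each tree is a nested tuple such as `("NP", ("D", "the"), ("N", "flight"))`. Because a product of two valid kernels is itself a valid kernel, the matrix of `pair_kernel` values over a training set could in principle be handed to any kernel machine that accepts a precomputed Gram matrix. A sum of the two tree kernels would be another plausible combination.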


Keywords: Natural Language Processing, Kernel Methods, Support Vector Machines



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Alessandra Giordani (1)
  • Alessandro Moschitti (1)
  1. Department of Computer Science and Engineering, University of Trento, Povo (TN), Italy
