A Model-Based Multilingual Natural Language Parser — Implementing Chomsky’s X-bar Theory in ModelCC

  • Luis Quesada
  • Fernando Berzal
  • Juan-Carlos Cubero
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8132)


Natural language support is a powerful feature that enhances user interaction with query systems. Natural language processing (NLP) requires dealing with ambiguities. Traditional probabilistic parsers provide a convenient means of disambiguation. However, they can return wrong sequences of tokens with no means of correcting them, they impose hard constraints on the way lexical and syntactic ambiguities can be resolved, and they offer only limited mechanisms for taking context into account. In comparison, model-based parser generators allow for flexible constraint specification and reference resolution, which makes it easier to take context into account. In this paper, we explain how the ModelCC model-based parser generator supports statistical language models and arbitrary probability estimators. We then present the ModelCC implementation of a natural language parser based on the syntax of most Romance and Germanic languages. This parser can be instantiated for a specific language by connecting it with a thesaurus (for lexical analysis), a linguistic corpus (for syntax-driven disambiguation), and an ontology or semantic database (for semantics-driven disambiguation).
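As a rough illustration of what disambiguation with pluggable probability estimators can look like, the following Python sketch ranks candidate analyses of an ambiguous phrase by multiplying the scores of independent estimators. It is not ModelCC code and does not reflect the ModelCC API: the names `Candidate`, `lexical_estimator`, `bigram_estimator`, and `best_parse`, as well as all probability values, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from math import prod
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical representation: a candidate analysis is a label plus
# the (word, part-of-speech) pairs it assigns to the input.
@dataclass(frozen=True)
class Candidate:
    label: str
    tags: Tuple[Tuple[str, str], ...]

Estimator = Callable[[Candidate], float]

def lexical_estimator(lexicon: Dict[Tuple[str, str], float]) -> Estimator:
    """Score a candidate by the product of toy P(tag | word) values."""
    def score(c: Candidate) -> float:
        return prod(lexicon.get(pair, 1e-6) for pair in c.tags)
    return score

def bigram_estimator(bigrams: Dict[Tuple[str, str], float]) -> Estimator:
    """Score a candidate by the product of toy tag-bigram probabilities."""
    def score(c: Candidate) -> float:
        pos = [tag for _, tag in c.tags]
        return prod(bigrams.get(pair, 1e-6) for pair in zip(pos, pos[1:]))
    return score

def best_parse(candidates: Sequence[Candidate],
               estimators: Sequence[Estimator]) -> Candidate:
    """Return the candidate with the highest combined estimator score."""
    return max(candidates, key=lambda c: prod(e(c) for e in estimators))

# Made-up numbers standing in for a thesaurus and a corpus.
lexicon = {("time", "N"): 0.9, ("time", "V"): 0.1,
           ("flies", "V"): 0.6, ("flies", "N"): 0.4}
bigrams = {("N", "V"): 0.5, ("V", "N"): 0.3}

candidates = [
    Candidate("noun-verb", (("time", "N"), ("flies", "V"))),
    Candidate("verb-noun", (("time", "V"), ("flies", "N"))),
]
best = best_parse(candidates, [lexical_estimator(lexicon),
                               bigram_estimator(bigrams)])
print(best.label)
```

Because each estimator is just a function from a candidate to a score, a semantics-driven estimator (for instance, one backed by an ontology) could be added to the list without changing `best_parse`. Multiplying scores is one simple way to combine them and is an assumption of this sketch.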


Keywords: natural languages, disambiguation, query parsing





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Luis Quesada (1)
  • Fernando Berzal (1)
  • Juan-Carlos Cubero (1)
  1. Department of Computer Science and Artificial Intelligence, CITIC, University of Granada, Granada, Spain
