Abstract
This paper introduces a dynamic generalized parser aimed primarily at common natural mathematical language. Our algorithm combines the efficiency of GLR parsing, the dynamic extensibility of tableless approaches, and the expressiveness of extended context-free grammars such as parallel multiple context-free grammars (PMCFGs). In particular, it supports efficient dynamic addition of rules to the grammar at any moment. The algorithm is fully incremental: parsing can resume with additional tokens without restarting the parse process, and the parser can predict possible next tokens. Additionally, we handle constraints on the token following a rule. When working with word tokens, this allows for grammatically correct English indefinite articles. When working with character tokens, it can also represent typical operations for scannerless parsing, such as maximal matches. Our long-term goal is to computerize a large library of existing mathematical knowledge using the new parser, starting from natural language input as found in textbooks or in the papers collected by the digital mathematical library (DML) projects around the world. In this paper, we present the algorithmic ideas behind our approach, give a short overview of the implementation, and present some efficiency results. The new parser is available at http://www.tigen.org/kevin.kofler/fmathl/dyngenpar/.
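To make the idea of a constraint on the token following a rule more concrete, here is a minimal Python sketch. It is not DynGenPar's API or algorithm: the names `Rule`, `next_ok`, `RULES`, and `matching_rules` are hypothetical, and the vowel test is a deliberately crude spelling heuristic (it mishandles words such as "hour" or "unit"). It only shows how a rule for "a" or "an" could be accepted or rejected depending on the next word token.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

VOWELS = set("aeiou")


def starts_with_vowel(token: Optional[str]) -> bool:
    """Crude spelling-based test; None means there is no next token."""
    return token is not None and token[:1].lower() in VOWELS


@dataclass(frozen=True)
class Rule:
    """Hypothetical grammar rule with a predicate on the following token."""
    lhs: str
    rhs: Tuple[str, ...]
    next_ok: Callable[[Optional[str]], bool]


# Illustrative rules only: "a" needs a following token that does not start
# with a vowel, "an" needs one that does.
RULES: List[Rule] = [
    Rule("Article", ("a",), lambda t: t is not None and not starts_with_vowel(t)),
    Rule("Article", ("an",), starts_with_vowel),
]


def matching_rules(tokens: List[str], pos: int) -> List[Rule]:
    """Return the rules whose right-hand side matches the tokens at `pos`
    and whose next-token constraint accepts the token after the match."""
    result = []
    for rule in RULES:
        end = pos + len(rule.rhs)
        if tuple(tokens[pos:end]) != rule.rhs:
            continue
        lookahead = tokens[end] if end < len(tokens) else None
        if rule.next_ok(lookahead):
            result.append(rule)
    return result
```

Under these assumptions, `matching_rules(["an", "integer"], 0)` returns the "an" rule, while `matching_rules(["a", "integer"], 0)` returns no rule, so the ungrammatical reading is rejected as soon as the following token is seen.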
Support by the Austrian Science Fund (FWF) under contract numbers P20631 and P23554 is gratefully acknowledged.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kofler, K., Neumaier, A. (2012). DynGenPar – A Dynamic Generalized Parser for Common Mathematical Language. In: Jeuring, J., et al. Intelligent Computer Mathematics. CICM 2012. Lecture Notes in Computer Science, vol 7362. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31374-5_26
DOI: https://doi.org/10.1007/978-3-642-31374-5_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31373-8
Online ISBN: 978-3-642-31374-5
eBook Packages: Computer Science, Computer Science (R0)