Abstract
Over the past few years, significant progress has been made in efficient processing with wide-coverage HPSG grammars. HPSG-based parsing systems are now available that can process medium-complexity sentences (of ten to twenty words, say) in average parse times equivalent to real (i.e. human reading) time. Many of the engineering improvements in current HPSG systems have been achieved through collaboration among multiple research centers and the mutual exchange of experience, encoding techniques, algorithms, and even pieces of software. This article presents an approach to grammar and system engineering, termed competence & performance profiling, that makes systematic experimentation and the precise empirical study of system properties a focal point in development. Adapting the profiling metaphor familiar from software engineering to constraint-based grammars and parsers enables developers to maintain an accurate record of system evolution, identify grammar and system deficiencies quickly, and compare the current system against earlier versions or against other systems. We discuss a number of example problems that motivate the experimental approach, and apply the empirical methodology in a fairly detailed discussion of progress made during a development period of three years.
Copyright information
© 2004 Kluwer Academic Publishers
About this chapter
Cite this chapter
Oepen, S., Callmeier, U. (2004). Measure for Measure: Towards Increased Component Comparability and Exchange. In: Bunt, H., Carroll, J., Satta, G. (eds) New Developments in Parsing Technology. Text, Speech and Language Technology, vol 23. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2295-6_19
DOI: https://doi.org/10.1007/1-4020-2295-6_19
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-2293-7
Online ISBN: 978-1-4020-2295-1
eBook Packages: Humanities, Social Sciences and Law