Developments in Parsing Technology: From Theory to Application

Bunt, Harry; Carroll, John; Satta, Giorgio

doi:10.1007/1-4020-2295-6_1

Chapter

Part of the book series: Text, Speech and Language Technology (TLTB, volume 23)

Author information

Authors and Affiliations

Computational Linguistics and Artificial Intelligence, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands
Harry Bunt
Cognitive and Computing Sciences, University of Sussex, Falmer, Brighton, BN1 9QH, UK
John Carroll
Department of Information Engineering, University of Padua, Via Gradenigo 6/A, 35131, Padua, Italy
Giorgio Satta

Editor information

Editors and Affiliations

Tilburg University, Tilburg, The Netherlands
Harry Bunt
University of Sussex, Brighton, UK
John Carroll
University of Padua, Padua, Italy
Giorgio Satta

About this chapter

Cite this chapter

Bunt, H., Carroll, J., Satta, G. (2004). Developments in Parsing Technology: From Theory to Application. In: Bunt, H., Carroll, J., Satta, G. (eds) New Developments in Parsing Technology. Text, Speech and Language Technology, vol 23. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2295-6_1

DOI: https://doi.org/10.1007/1-4020-2295-6_1
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-2293-7
Online ISBN: 978-1-4020-2295-1
eBook Packages: Humanities, Social Sciences and Law
