Preliminary Analysis of a Breadth-First Parsing Algorithm: Theoretical and Experimental Results

  • W. A. Martin
  • K. W. Church
  • R. S. Patil
Part of the Symbolic Computation book series (SYMBOLIC)


We first trace a brief history of context-free parsing algorithms and then describe some representation issues. The purpose of this paper is to share our philosophy and experience in adapting a well-known context-free parsing algorithm (Earley’s algorithm [9, 10] and variations thereof [29, 14, 27, 28]) to the parsing of a difficult and wide-ranging corpus of sentences. The sentences were gathered by Malhotra [23] in an experiment in which businessmen were fooled into thinking they were interacting with a computer when they were actually interacting with Malhotra in another room. The sentences are given in Appendix I. The MALHOTRA corpus is considerably more difficult than a second collection given in Appendix II (originally published in [16]). Section 4 compares empirical results obtained from these collections against theoretical predictions.


Keywords: Noun Phrase; Relative Clause; Overhead Cost; Prepositional Phrase; Catalan Number




  1. Aho AV, Ullman JD (1972) The Theory of Parsing, Translation, and Compiling. Englewood Cliffs: Prentice-Hall
  2. Bar-Hillel Y, Gaifman C, Shamir E: On Categorial and Phrase Structure Grammars. The Bulletin of the Research Council of Israel, 9F, 1–16
  3. Bresnan J (1981) The Passive in Lexical Theory. Occasional Paper No 7, Center for Cognitive Science, 1980. Also in: Bresnan J (ed). Cambridge: MIT Press
  4. Burton R (1976) Semantic Grammar: An Engineering Technique for Constructing Natural Language Understanding Systems. BBN Report No 3453
  5. Chomsky N (1980) On Binding. Linguistic Inquiry
  6. Church K (1980) On Memory Limitations in Natural Language Processing. MIT/LCS/TR245 (also available from the Indiana University Linguistics Club)
  7. Church K, Patil R (1983) Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table. MIT/LCS/TM-216. Also in: American Journal of Computational Linguistics
  8. Dostert B, Thompson F (1971) How Features Resolve Syntactic Ambiguity. In: Minker J, Rosenfeld S (eds): Proceedings of the Symposium on Information Storage and Retrieval
  9. Earley J (1968) An Efficient Context-Free Parsing Algorithm. Unpublished Ph. D. Thesis. Carnegie-Mellon University
  10. Earley J (1970) An Efficient Context-Free Parsing Algorithm. Communications of the ACM 13 (2)
  11. Ford M, Bresnan J, Kaplan R (1981) A Competence-Based Theory of Syntactic Closure. Paper presented at the Sloan Workshop on Parsing Long Distance Dependencies, University of Massachusetts at Amherst, 1981. Also in: Bresnan J (ed). Cambridge: MIT Press
  12. Gazdar G (1981) Unbounded Dependencies and Coordinate Structure. Linguistic Inquiry 12 (2)
  13. Gazdar G: Phrase Structure Grammar. In: Jacobson P, Pullum G (eds): The Nature of Syntactic Representation
  14. Graham S, Harrison M, Ruzzo W (1980) An Improved Context-Free Recognizer. ACM Transactions on Programming Languages and Systems 2 (3), 415–462
  15. Harris L: Experience with ROBOT in 12 Commercial Natural Language Data Base Query Applications. IJCAI 79, p 365
  16. Hendrix G, Sacerdoti E, Sagalowicz D, Slocum J (1978) Developing a Natural Language Interface to Complex Data. ACM Transactions on Database Systems 3 (2), 105–147
  17. Joos M (1968) The English Verb: Form and Meanings. Madison, Milwaukee, and London: The University of Wisconsin Press
  18. Kaplan R (1972) Augmented Transition Networks as Psychological Models of Sentence Comprehension. Artificial Intelligence 3, 77–100
  19. Kaplan R (1973) A General Syntactic Processor. In: Rustin R (ed): Natural Language Processing. New York: Algorithmics Press
  20. Kaplan R, Bresnan J (1981) Lexical-Functional Grammar: A Formal System for Grammatical Representation. Occasional Paper, Center for Cognitive Science, 1980. Also in: Bresnan J (ed). Cambridge: MIT Press
  21. Knuth D (1975) Fundamental Algorithms. In: The Art of Computer Programming, Vol 1. Reading: Addison-Wesley
  22. Kuno S, Oettinger AG (1963) Multiple Path Syntactic Analyzer. In: Information Processing. Amsterdam: North-Holland
  23. Malhotra A (1975) Design Criteria for a Knowledge-Based English Language System for Management: An Experimental Analysis. MIT/LCS/TR-146
  24. Marcus M (1980) A Theory of Syntactic Recognition for Natural Language. Cambridge: MIT Press
  25. Mathlab Group (1977) Macsyma Reference Manual. Laboratory for Computer Science, MIT
  26. Milne R (1980) A Framework for Deterministic Parsing Using Syntax and Semantics. DAI Working Paper 64. Department of Artificial Intelligence, University of Edinburgh
  27. Pratt VR (1973) A Linguistics Oriented Programming Language. IJCAI 3
  28. Pratt V (1975) Lingol - A Progress Report. IJCAI 4
  29. Ruzzo WL (1978) General Context-Free Language Recognition. Unpublished Ph. D. Thesis. University of California, Berkeley
  30. Sheil B (1976) Observations on Context-Free Parsing. Statistical Methods in Linguistics, 71–109
  31. Shipman D, Marcus M (1979) Towards Minimal Data Structures for Deterministic Parsing. IJCAI 79
  32. Steele G (1980) The Definition and Implementation of a Computer Programming Language Based on Constraints. MIT, AI-TR-595
  33. Valiant L (1975) General Context Free Recognition in Less Than Cubic Time. J. Computer and System Sciences 10, 308–315
  34. Woods W (1970) Transition Network Grammars for Natural Language Analysis. Communications of the ACM 13 (10), 591–606

Copyright information

© Springer-Verlag Berlin Heidelberg 1987
