The Ecology of Language

  • Donald E. Walker
Chapter
Part of the Linguistica Computazionale book series (LICO, volume 9)

Abstract

The ecology of language is a concept introduced to clarify the relationships that hold between the use of language by people and the contexts in which those uses take place. Developing an effective ecology requires access to large amounts of textual and lexical data. A number of projects are currently underway in the United States, in Europe, and in Japan to provide such materials in a form that allows them to be shared as community resources. This paper first briefly reviews a number of these efforts. Then it presents the results of a Workshop on Open Lexical and Textual Resources that was convened to discuss the value of amassing such collections. Consideration is subsequently given to types of corpora and procedures for designing and analyzing them. Three projects are singled out for special mention: (1) the Text Encoding Initiative, which is formulating and disseminating guidelines for the preparation and dissemination of machine-readable texts for research and for use by the language industries; (2) the Data Collection Initiative, which is acquiring and preparing a large text corpus to be made available for scientific research at cost and without royalties; and (3) the Consortium for Lexical Research, which is collecting and disseminating lexical resources and providing a clearinghouse for research results that make use of them. Understanding the ecology of language and analyzing the resources that contribute to it will prove increasingly important for electronic document delivery.

Keywords

Machine Translation; Computational Linguistics; Oxford English Dictionary; Lexical Resource; British National Corpus
These keywords were added by machine and not by the authors.

Copyright information

© Springer Science+Business Media Dordrecht 1994

Authors and Affiliations

  • Donald E. Walker, Artificial Intelligence and Information Science Research, Bellcore, USA
