Abstract
The ecology of language is a concept introduced to clarify the relationships that hold between the use of language by people and the contexts in which those uses take place. Developing an effective ecology requires access to large amounts of textual and lexical data. A number of projects are currently underway in the United States, in Europe, and in Japan to provide such materials in a form that allows them to be shared as community resources. This paper first briefly reviews a number of these efforts. Then it presents the results of a Workshop on Open Lexical and Textual Resources that was convened to discuss the value of amassing such collections. Consideration is subsequently given to types of corpora and procedures for designing and analyzing them. Three projects are singled out for special mention: (1) the Text Encoding Initiative, which is formulating and disseminating guidelines for the preparation and dissemination of machine-readable texts for research and for use by the language industries; (2) the Data Collection Initiative, which is acquiring and preparing a large text corpus to be made available for scientific research at cost and without royalties; and (3) the Consortium for Lexical Research, which is collecting and disseminating lexical resources and providing a clearinghouse for research results that make use of them. Understanding the ecology of language and analyzing the resources that contribute to it will prove increasingly important for electronic document delivery.
An earlier version of this paper appeared in Proceedings of the International Workshop on Electronic Dictionaries, EDR TR-031, February 1991, Japan Electronic Dictionary Research Institute, Tokyo, Japan, 10–22.
Copyright information
© 1994 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Walker, D.E. (1994). The Ecology of Language. In: Zampolli, A., Calzolari, N., Palmer, M. (eds) Current Issues in Computational Linguistics: In Honour of Don Walker. Linguistica Computazionale, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-0-585-35958-8_19
DOI: https://doi.org/10.1007/978-0-585-35958-8_19
Publisher Name: Springer, Dordrecht
Print ISBN: 978-0-7923-2998-5
Online ISBN: 978-0-585-35958-8
eBook Packages: Springer Book Archive