Abstract
The 2016 CICLing conference was dedicated to the memory of Adam Kilgarriff, who died the year before. Adam leaves behind a tremendous scientific legacy, and those working in computational linguistics, other fields of linguistics, and lexicography are indebted to him. This paper is a summary review of some of Adam’s main scientific contributions. It is not and cannot be exhaustive, and it is written by only a small selection of his large network of collaborators. Nevertheless, we hope it will provide a useful summary for readers wanting to know more about the origins of the work, events and software that are so widely relied upon by scientists today and that will undoubtedly continue to be so in the foreseeable future.
Notes
- 1.
In this paper, natural language processing (NLP) is used synonymously with computational linguistics.
- 2.
Like Oxford, the University of Sussex, where Adam undertook his doctoral training, uses DPhil rather than PhD as the abbreviation for its doctoral degrees.
- 3.
The company he founded is Lexical Computing Ltd. He was also a partner – with Sue Atkins and Michael Rundell – in another company, Lexicography MasterClass, which provides consultancy and training and runs the Lexicom workshops in lexicography and lexical computing; http://www.lexmasterclass.com/.
- 4.
This paper is perhaps Adam’s most influential piece, having been reprinted in three different collections since its original publication.
- 5.
- 6.
In fact, the title is a quote which Adam attributes to Sue Atkins.
- 7.
The Sketch Engine, described in Sect. 6, in particular is an incredibly valuable resource that is used regularly at Colorado for revising English VerbNet class memberships and developing PropBank frame files for several languages.
- 8.
See Fig. 1, below.
- 9.
Working papers can be found online at http://wackybook.sslmit.unibo.it.
- 10.
- 11.
- 12.
Personal communication from Adam to Serge Sharoff.
- 13.
Other examples include his eagerness to encourage participants in evaluations such as Senseval, reminding people to focus on analysis rather than on who came top [42], and his company’s aim of ‘corpora for all’.
- 14.
References
Atkins, S.: Tools for computer-aided corpus lexicography: the Hector project. Acta Linguistica Hungarica 41, 5–72 (1993)
Atkins, S., Rundell, M., Kilgarriff, A.: Database of ANalysed Texts of English (DANTE). In: Proceedings of Euralex (2010)
Banko, M., Brill, E.: Scaling to very very large corpora for natural language disambiguation. In: ACL, pp. 26–33 (2001)
Baroni, M., Kilgarriff, A., Pomikálek, J., Rychlý, P.: WebBootCaT: a web tool for instant corpora. In: Proceedings of Euralex, Torino, Italy, pp. 123–132 (2006)
Baroni, M., Kilgarriff, A., Pomikálek, J., Rychlý, P.: WebBootCaT: instant domain-specific corpora to support human translators. In: Proceedings of EAMT, pp. 247–252 (2006)
Copestake, A.: Implementing Typed Feature Structure Grammars. CSLI Lecture Notes. CSLI Publications, Stanford (2002). http://opac.inria.fr/record=b1098622
Dale, R., Kilgarriff, A.: Helping our own: text massaging for computational linguistics as a new shared task. In: Proceedings of the 6th International Natural Language Generation Conference, pp. 263–267. Association for Computational Linguistics (2010)
Erjavec, T., Evans, R., Ide, N., Kilgarriff, A.: The Concede model for lexical databases. In: Proceedings of the Second International Conference on Language Resources and Evaluation, pp. 355–362. Athens, Greece (2000)
Evans, R., Gazdar, G.: DATR: a language for lexical knowledge representation. Comput. Linguist. 22(2), 167–216 (1996). http://eprints.brighton.ac.uk/11552/
Gale, W., Church, K., Yarowsky, D.: One sense per discourse. In: Proceedings of the 4th DARPA Speech and Natural Language Workshop, pp. 233–237 (1992)
Gardner, S., Nesi, H.: A classification of genre families in university student writing. Appl. Linguist. 34(1), 25–52 (2012)
Gilquin, G., Granger, S., Paquot, M.: Learner corpora: the missing link in EAP pedagogy. J. Engl. Acad. Purp. 6(4), 319–335 (2007)
Hanks, P.: Do word meanings exist? Comput. Humanit. 34(1–2), 205–215 (2000). SENSEVAL Special Issue
Ide, N., Kilgarriff, A., Romary, L.: A formal model of dictionary structure and content. In: Heid, U., Evert, S., Lehmann, E., Rohrer, C. (eds.) Proceedings of the 9th EURALEX International Congress. Institut für Maschinelle Sprachverarbeitung, Stuttgart, Germany, pp. 113–126, August 2000
Ide, N., Véronis, J.: Encoding dictionaries. Comput. Humanit. 29(2), 167–179 (1995). http://dx.doi.org/10.1007/BF01830710
Jakubíček, M., Rychlý, P., Kilgarriff, A., McCarthy, D.: Fast syntactic searching in very large corpora for many languages. In: PACLIC 24 Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation, Tokyo, pp. 741–747 (2010)
Kallas, J., Tuulik, M., Langemets, M.: The basic Estonian dictionary: the first monolingual L2 learner’s dictionary of Estonian. In: Proceedings of the XVI Euralex Congress (2014)
Kilgarriff, A., Kovář, V., Frankenberg-Garcia, A.: Bilingual word sketches: three flavours. In: Electronic Lexicography in the 21st Century: Thinking outside the Paper (eLex 2013), pp. 17–19 (2013)
Kilgarriff, A.: Polysemy. Ph.D. thesis, University of Sussex (1992)
Kilgarriff, A.: Dictionary word-sense distinctions: an enquiry into their nature. Comput. Humanit. 26(1–2), 365–387 (1993)
Kilgarriff, A.: The hard parts of lexicography. Int. J. Lexicography 11(1), 51–54 (1997)
Kilgarriff, A.: Putting frequencies in the dictionary. Int. J. Lexicography 10(2), 135–155 (1997)
Kilgarriff, A.: What is word sense disambiguation good for? In: Proceedings of Natural Language Processing in the Pacific Rim, pp. 209–214 (1997)
Kilgarriff, A.: Gold standard datasets for evaluating word sense disambiguation programs. Comput. Speech Lang. 12(3), 453–472 (1998)
Kilgarriff, A.: I don’t believe in word senses. Comput. Humanit. 31(2), 91–113 (1998). Reprinted in Fontenelle (ed.) Practical Lexicography: A Reader. Oxford University Press (2008). Also reprinted in Nerlich, Todd, Herman and Clarke (eds.) Polysemy: Flexible Patterns of Meaning in Language and Mind. Walter de Gruyter, pp. 361–392. To be reprinted in Pustejovsky and Wilks (eds.) Readings in the Lexicon. MIT Press
Kilgarriff, A.: SENSEVAL: an exercise in evaluating word sense disambiguation programs. In: Proceedings of LREC, Granada, pp. 581–588 (1998)
Kilgarriff, A.: Comparing corpora. Int. J. Corpus Linguist. 6(1), 1–37 (2001)
Kilgarriff, A.: Language is never ever ever random. Corpus Linguist. Linguist. Theor. 1(2), 263–276 (2005)
Kilgarriff, A.: Collocationality (and how to measure it). In: Proceedings of the 12th EURALEX International Congress, Torino, Italy, September 2006, pp. 997–1004 (2006)
Kilgarriff, A.: Word senses. In: Agirre, E., Edmonds, P. (eds.) Word Sense Disambiguation, Algorithms and Applications, pp. 29–46. Springer, Heidelberg (2006). https://doi.org/10.1007/978-1-4020-4809-8
Kilgarriff, A.: Googleology is bad science. Comput. Linguist. 33(1), 147–151 (2007)
Kilgarriff, A.: Grammar is to meaning as the law is to good behaviour. Corpus Linguist. Linguist. Theor. 3(2), 195–197 (2007)
Kilgarriff, A.: Simple maths for keywords. In: Proceedings of Corpus Linguistics, Liverpool, UK (2009)
Kilgarriff, A.: Comparable corpora within and across languages, word frequency lists and the Kelly project. In: Proceedings of Workshop on Building and Using Comparable Corpora at LREC, Malta (2010)
Kilgarriff, A.: A detailed, accurate, extensive, available English lexical database. In: Proceedings of the NAACL HLT 2010 Demonstration Session, pp. 21–24. Association for Computational Linguistics, Los Angeles, June 2010. http://www.aclweb.org/anthology/N10-2006
Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., Suchomel, V.: The Sketch Engine: ten years on. Lexicography 1(1), 7–36 (2014). http://dx.doi.org/10.1007/s40607-014-0009-9
Kilgarriff, A., Charalabopoulou, F., Gavrilidou, M., Johannessen, J.B., Khalil, S., Kokkinakis, S.J., Lew, R., Sharoff, S., Vadlapudi, R., Volodina, E.: Corpus-based vocabulary lists for language learners for nine languages. Lang. Resour. Eval. 48(1), 121–163 (2014)
Kilgarriff, A., Evans, R., Koeling, R., Rundell, M., Tugwell, D.: WASPBENCH: a lexicographer’s workbench supporting state-of-the-art word sense disambiguation. In: Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics, EACL 2003, vol. 2, pp. 211–214. Association for Computational Linguistics, Stroudsburg (2003). https://doi.org/10.3115/1067737.1067787
Kilgarriff, A., Grefenstette, G.: Introduction to the special issue on web as corpus. Comput. Linguist. 29(3), 333–347 (2003)
Kilgarriff, A., Husák, M., McAdam, K., Rundell, M., Rychlý, P.: GDEX: automatically finding good dictionary examples in a corpus. In: Proceedings of the 13th EURALEX International Congress, Barcelona, Spain, July 2008, pp. 425–432 (2008)
Kilgarriff, A., Jakubíček, M., Kovář, V., Rychlý, P., Suchomel, V.: Finding terms in corpora for many languages with the Sketch Engine. In: EACL 2014, p. 53 (2014)
Kilgarriff, A., Palmer, M.: Introduction to the special issue on SENSEVAL. Comput. Humanit. 34(1–2), 1–13 (2000). SENSEVAL Special Issue
Kilgarriff, A., Palmer, M. (eds.): SENSEVAL98: Evaluating Word Sense Disambiguation Systems, pp. 1–2. Kluwer, Dordrecht (2000)
Kilgarriff, A., Rosenzweig, J.: Framework and results for English SENSEVAL. Comput. Humanit. 34(1–2), 15–48 (2000). SENSEVAL Special Issue
Kilgarriff, A., Rychlý, P., Kovář, V., Baisa, V.: Finding multiwords of more than two words. In: Proceedings of EURALEX 2012 (2012)
Kilgarriff, A., Rychlý, P.: Semi-automatic dictionary drafting. In: de Schryver, G.M. (ed.) A Way with Words: Recent Advances in Lexical Theory and Analysis. A Festschrift for Patrick Hanks, Menha (2010)
Kilgarriff, A., Rychlý, P., Jakubíček, M., Kovář, V., Baisa, V., Kocincová, L.: Extrinsic corpus evaluation with a collocation dictionary task. In: LREC, pp. 545–552 (2014)
Kilgarriff, A., Rychlý, P., Smrž, P., Tugwell, D.: The Sketch Engine. In: Proceedings of Euralex, Lorient, France, pp. 105–116 (2004). Reprinted in Patrick Hanks (ed.) (2007). Lexicology: Critical Concepts in Linguistics. Routledge, London
Kilgarriff, A., Tugwell, D.: WASP-Bench: an MT lexicographer’s workstation supporting state-of-the-art lexical disambiguation. In: Proceedings of the MT Summit VIII, Santiago de Compostela, Spain, pp. 187–190, September 2001
Kosem, I., Gantar, P., Krek, S.: Automation of lexicographic work: an opportunity for both lexicographers and crowd-sourcing. In: Electronic Lexicography in the 21st Century: Thinking Outside the Paper: Proceedings of the eLex 2013 Conference, Tallinn, Estonia, 17–19 October 2013, pp. 32–48 (2013)
Kosem, I., Husák, M., McCarthy, D.: GDEX for Slovene. In: Proceedings of eLex2011, Bled, Slovenia (2011)
Krek, S., Abel, A., Tiberius, C.: ENeL Project: DWS/CQS Survey Analysis (2015). http://www.elexicography.eu/wp-content/uploads/2015/04/ENeL_WG3_Vienna_DWS_CQS_final_web.pdf
Leech, G.: 100 million words of English: the British National Corpus (BNC). Lang. Res. 28(1), 1–13 (1992)
Louw, B., Chateau, C.: Semantic prosody for the 21st century: are prosodies smoothed in academic contexts? A contextual prosodic theoretical perspective. In: Proceedings of the tenth JADT Conference on Statistical Analysis of Textual Data, pp. 754–764. Citeseer (2010)
Baroni, M., Chantree, F., Kilgarriff, A., Sharoff, S.: CleanEval: a competition for cleaning web pages. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, pp. 638–643 (2008)
Mautner, G.: Mining large corpora for social information: the case of elderly. Lang. Soc. 36(01), 51–72 (2007)
McCarthy, D., Kilgarriff, A., Jakubíček, M., Reddy, S.: Semantic word sketches. In: 8th International Corpus Linguistics Conference (CL 2015) (2015)
McEnery, T., Wilson, A.: Corpus Linguistics. Edinburgh University Press, Edinburgh (1999)
Mihalcea, R., Chklovski, T., Kilgarriff, A.: The SENSEVAL-3 English lexical sample task. In: Mihalcea, R., Edmonds, P. (eds.) Proceedings SENSEVAL-3 Second International Workshop on Evaluating Word Sense Disambiguation Systems, Barcelona, Spain, pp. 25–28 (2004)
Nastase, V., Sayyad-Shirabad, J., Sokolova, M., Szpakowicz, S.: Learning noun-modifier semantic relations with corpus-based and WordNet-based features. In: Proceedings of the National Conference on Artificial Intelligence, vol. 21, no. 1, p. 781. AAAI Press/MIT Press, Menlo Park, Cambridge, London (2006)
O’Donovan, R., O’Neill, M.: A systematic approach to the selection of neologisms for inclusion in a large monolingual dictionary. In: Proceedings of the XIII EURALEX International Congress, Barcelona, 15–19 July 2008, pp. 571–579 (2008)
Peters, W., Kilgarriff, A.: Discovering semantic regularity in lexical resources. Int. J. Lexicography 13(4), 287–312 (2000)
Pomikálek, J., Rychlý, P., Kilgarriff, A., et al.: Scaling to billion-plus word corpora. Adv. Comput. Linguist. 41, 3–13 (2009)
Preiss, J., Yarowsky, D. (eds.): Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems, Toulouse, France (2001). SIGLEX workshop organized by Cotton, S., Edmonds, P., Kilgarriff, A., Palmer, M.
Rundell, M.: Macmillan English Dictionary. Macmillan, Oxford (2002)
Rundell, M., Kilgarriff, A.: Automating the creation of dictionaries: where will it all end? In: Meunier, F. et al. (eds.) A Taste for Corpora. In Honour of Sylviane Granger, pp. 257–281. Benjamins, Amsterdam (2011)
Rychlý, P.: Korpusové manažery a jejich efektivní implementace [Corpus managers and their effective implementation]. Ph.D. thesis, Masaryk University, Brno (February 2000)
Rychlý, P.: Manatee/Bonito - a modular corpus manager. In: Proceedings of Recent Advances in Slavonic Natural Language Processing 2007. Masaryk University, Brno (2007)
Rychlý, P.: A lexicographer-friendly association score. In: Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2008, pp. 6–9 (2008)
Sharoff, S.: Creating general-purpose corpora using automated search engine queries. In: Baroni, M., Bernardini, S. (eds.) WaCky! Working Papers on the Web as Corpus, Gedit, Bologna (2006)
Sinclair, J.: The lexical item. In: Weigand, E. (ed.) Contrastive Lexical Semantics. Benjamins, Amsterdam (1998)
Tugwell, D., Kilgarriff, A.: WASP-Bench: a lexicographic tool supporting word-sense disambiguation. In: Preiss, J., Yarowsky, D. (eds.) Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems, Toulouse, France (2001)
Tugwell, D., Kilgarriff, A.: Word sketch: extraction and display of significant collocations for lexicography. In: Proceedings of the ACL Workshop on Collocations, Toulouse, France, pp. 32–28 (2001)
Wellner, B., Pustejovsky, J., Havasi, C., Rumshisky, A., Saurí, R.: Classification of discourse coherence relations: an exploratory study using multiple knowledge sources. In: Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue, SigDIAL 2006, pp. 117–125, Association for Computational Linguistics, Stroudsburg (2006). http://dl.acm.org/citation.cfm?id=1654595.1654618
Yarowsky, D.: One sense per collocation. In: Proceedings of the ARPA Workshop on Human Language Technology, pp. 266–271. Morgan Kaufmann (1993)
Yarowsky, D.: Unsupervised word sense disambiguation rivaling supervised methods. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pp. 189–196 (1995)
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Evans, R. et al. (2018). Adam Kilgarriff’s Legacy to Computational Linguistics and Beyond. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_1
DOI: https://doi.org/10.1007/978-3-319-75477-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science (R0)