Abstract
This paper presents work in progress on an algorithm that tracks and identifies changes in the vocabulary used to describe particular concepts over time, with emphasis on treating shifts in how concepts are expressed as distinct from changes in word meaning. We apply the algorithm to word vectors generated from Google Books n-grams from 1800–1990 and evaluate the induced networks with respect to their flexibility (robustness to changes in vocabulary) and stability (resistance to leaping from topic to topic). We also describe work in progress using the British National Bibliography Linked Open Data Serials to construct a “ground truth” evaluation dataset for algorithms that aim to detect shifts in the vocabulary used to describe concepts. Finally, we discuss limitations of the proposed method, ways in which it could be improved, and other considerations.
Notes
1. Rather than thinking of concepts in a way that strongly links them to a particular lexeme (e.g., “the concept of justice”), we have argued elsewhere that it is preferable to think of concepts (at least insofar as they are expressed in discourse) in terms of their functions, one of which is to permit two interlocutors to sense that they have arrived at a common understanding of the matter under discussion. This is rather different and more abstract than the notion of a concept as being equivalent to a class in a classical ontology, and more specific than a theme or topic. However, for purposes of clarity and compatibility with the way related work speaks about “concepts,” our use of the word in this paper roughly conforms to the vague OED definition of “a general idea or notion.” We are explicitly not using it to refer to “the meaning that is realized by a word or expression.”
2. Because the threshold is initially set so high that no such subgraph can be found, this method ensures that the first subgraph discovered which meets these criteria is the one desired.
3. Note that every node in the subgraph must correspond to a unique word.
4. Available: http://ilps.science.uva.nl/resources/shifts/.
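The threshold-lowering idea in notes 2 and 3 can be illustrated with a minimal sketch. This is not the paper's algorithm: the acceptance criterion used here (the connected component around a seed word reaches `min_size` words) and all names, including `first_qualifying_subgraph`, the toy similarity scores, and the parameters, are hypothetical stand-ins for the criteria defined in the full text. The sketch only shows why starting above every similarity score guarantees that the first qualifying subgraph is the one found at the highest possible threshold.

```python
def first_qualifying_subgraph(similarity, seed, min_size=3):
    """Lower a similarity threshold until the seed's component qualifies.

    `similarity` maps (word_a, word_b) pairs to a score. The criterion
    (component size >= min_size) is an assumption made for illustration.
    Returns (threshold, set_of_words), or None if nothing qualifies.
    """
    # Implicitly start above every score (no edges, so no subgraph), then
    # step down through the distinct scores from highest to lowest.
    for threshold in sorted(set(similarity.values()), reverse=True):
        # Keep only edges whose similarity is at or above the threshold.
        neighbours = {}
        for (a, b), score in similarity.items():
            if score >= threshold:
                neighbours.setdefault(a, set()).add(b)
                neighbours.setdefault(b, set()).add(a)
        # Collect the seed's connected component; using a set means every
        # node corresponds to a unique word, as note 3 requires.
        component, stack = {seed}, [seed]
        while stack:
            for nxt in neighbours.get(stack.pop(), ()):
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        if len(component) >= min_size:
            return threshold, component
    return None


# Toy similarity scores, invented purely for this example.
toy_similarity = {
    ("liberty", "freedom"): 0.9,
    ("freedom", "rights"): 0.7,
    ("liberty", "rights"): 0.4,
    ("rights", "tax"): 0.3,
}
result = first_qualifying_subgraph(toy_similarity, "liberty", min_size=3)
```

Because thresholds are tried from highest to lowest and the search returns on the first success, no qualifying subgraph at a stricter threshold can have been missed.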
References
Seiler, T.B., Wannenmacher, W. (eds.): Concept Development and the Development of Word Meaning, vol. 12. Springer Science & Business Media, Berlin (2012)
Margolis, E., Laurence, S.: Concepts. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy (2014). http://plato.stanford.edu/archives/spr2014/entries/concepts/
Fodor, J.A.: The Language of Thought. Crowell, New York (1975)
Clark, E.V.: Meaning and concepts. In: Mussen, P.H. (ed.) Handbook of Child Psychology, vol. 3. Cognitive Development, pp. 787–840. Wiley, New York (1983)
Murphy, G.: The Big Book of Concepts. MIT Press, Cambridge (2002)
Glanzberg, M.: Meaning, concepts, and the lexicon. Croatian J. Philos. 11(1), 1–29 (2011)
OED Online: “Broadcast”. Oxford University Press. http://www.oed.com
Hamilton, W.L., Leskovec, J., Jurafsky, D.: Diachronic word embeddings reveal statistical laws of semantic change. arXiv preprint arXiv:1605.09096 (2016)
Wevers, M., Kenter, T., Huijnen, P.: Concepts through time: tracing concepts in Dutch newspaper discourse (1890–1990) using word embeddings. In: Digital Humanities 2015, Sydney (2015)
Kenter, T., Wevers, M., Huijnen, P.: Ad hoc monitoring of vocabulary shifts over time. In: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, pp. 1191–1200. ACM, New York (2015)
Wang, X., McCallum, A.: Topics over time: a non-Markov continuous-time model of topical trends. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 424–433. ACM, New York (2006)
Blei, D.M., Lafferty, J.D.: Dynamic topic models. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 113–120 (2006)
Hall, D., Jurafsky, D., Manning, C.D.: Studying the history of ideas using topic models. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 363–371. Association for Computational Linguistics, East Stroudsburg, Pennsylvania (2008)
Gohr, A., Hinneburg, A., Schult, R., Spiliopoulou, M.: Topic evolution in a stream of documents. In: Proceedings of SIAM International Conference on Data Mining. SIAM, Philadelphia (2009)
Gulordava, K., Baroni, M.: A distributional similarity approach to the detection of semantic change in the Google Books N-gram corpus. In: Proceedings of the EMNLP 2011 Geometrical Models for Natural Language Semantics (GEMS) Workshop. Association for Computational Linguistics, East Stroudsburg, Pennsylvania (2011)
Wijaya, D.T., Yeniterzi, R.: Understanding semantic change of words over centuries. In: Proceedings of DETECT, International Workshop on DETecting and Exploiting Cultural diversiTy on the Social Web, pp. 35–40. ACM, New York (2011)
Pesquita, C., Couto, F.M.: Predicting the extension of biomedical ontologies. PLoS Comput. Biol. 8(9), e1002630 (2012)
Early English Books Online. Web, 02 October 2017. https://eebo.chadwyck.com/search?SCREEN=search_advanced.htx
OED Online: “Biology”. Oxford University Press. http://www.oed.com
Recchia, G., Jones, E., Nulty, P., Regan, J., de Bolla, P.: Tracing shifting conceptual vocabularies through time. In: Proceedings of Detection, Representation and Management of Concept Drift in Linked Open Data (Drift-a-LOD), Bologna, Italy, 20 November 2016, pp. 2–9 (2016). CEUR-WS.org/Vol-1799/Drift-a-LOD2016_paper_1.pdf
Acknowledgments
This paper is a revised and expanded version of a paper that previously appeared in the 2016 CEUR Workshop Proceedings [20] and was invited for inclusion in the present volume. The research presented here was supported by a private donation to the Cambridge Centre for Digital Knowledge (CCDK) at the University of Cambridge.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Recchia, G., Jones, E., Nulty, P., Regan, J., de Bolla, P. (2017). Tracing Shifting Conceptual Vocabularies Through Time. In: Ciancarini, P., et al. Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10180. Springer, Cham. https://doi.org/10.1007/978-3-319-58694-6_2
DOI: https://doi.org/10.1007/978-3-319-58694-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58693-9
Online ISBN: 978-3-319-58694-6
eBook Packages: Computer Science (R0)