Abstract
Language change is a complex social phenomenon, revealing pathways of communication and sociocultural influence. Although language change has long been a topic of study in sociolinguistics, traditional linguistic research methods rely on circumstantial evidence, estimating the direction of change from differences between older and younger speakers. In this paper, we use a data set of several million Twitter users to track language changes in progress. First, we show that language change can be viewed as a form of social influence: we observe complex contagion for phonetic spellings and “netspeak” abbreviations (e.g., lol), but not for older dialect markers from spoken language. Next, we test whether specific types of social network connections are more influential than others, using a parametric Hawkes process model. We find that tie strength plays an important role: densely embedded social ties are significantly better conduits of linguistic influence. Geographic locality appears to play a more limited role: we find relatively little evidence to support the hypothesis that individuals are more influenced by geographically local social ties, even in their usage of geographical dialect markers.
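As a rough illustration of what a parametric Hawkes process looks like, the sketch below computes the event intensity for one user given earlier events by other users. It is a minimal sketch under stated assumptions, not the model as fitted in the paper: the exponential decay kernel, the linear form of the influence weight, and all names (`infectivity`, `intensity`, `edge_features`, `theta`) are hypothetical choices made for illustration.

```python
import math


def infectivity(theta, features):
    """Assumed linear parameterization: weight = theta . features(source -> target)."""
    return sum(w * x for w, x in zip(theta, features))


def intensity(t, target, events, base_rate, theta, edge_features, decay=1.0):
    """Hawkes intensity for `target` at time t.

    events: list of (time, source_user) pairs.
    edge_features: dict mapping (source, target) to a feature vector
        (hypothetical, e.g. indicators for tie strength or locality).
    Only events strictly before t contribute, each decaying exponentially.
    """
    rate = base_rate
    for t_n, source in events:
        if t_n >= t:
            continue  # future events cannot excite the present
        feats = edge_features.get((source, target))
        if feats is None:
            continue  # no connection, so no excitation
        rate += infectivity(theta, feats) * math.exp(-decay * (t - t_n))
    return rate
```

Under this parameterization, comparing the fitted entries of `theta` across feature types is one way to ask which kinds of connections carry more influence.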
Notes
1. The basic unit of linguistic differentiation is referred to as a “variable” in the sociolinguistic and dialectological literature [50]. We maintain this terminology here.
2. After running SAGE to identify words with coefficients above 2.0, we manually removed hashtags, named entities, non-English words, and descriptions of events.
3. Other sources, such as http://urbandictionary.com, report asl to be an abbreviation of age, sex, location? However, this definition is not compatible with typical usage on Twitter, e.g., currently hungry asl or that movie was funny asl.
4. ard, inna, and lls appear on multiple cities’ lists. These words are characteristic of the neighboring cities of Baltimore, Philadelphia, and Washington D.C.
5. The shuffle test assumes that the likelihood of two users forming a social network connection does not change over time. Researchers have proposed a test [32] that removes this assumption; we will scale this test to our data set in future work.
6. We also compared the full feature set (i.e., F1+F2+F3+F4) to feature set F1+F2+F3 and feature set F1+F2+F4. The results were almost identical, indicating that F3 (tie strength) and F4 (local) provide complementary information.
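The idea behind the shuffle test in note 5 can be sketched as a permutation test: adoption times are shuffled across adopters while the network is held fixed, and the observed value of a network statistic is compared with its shuffled distribution. This is a simplified sketch, not the procedure as run in the paper. The statistic used here (the number of adoptions preceded by a neighbor's adoption) is an illustrative stand-in, and the function names are hypothetical.

```python
import random


def exposed_adoptions(neighbors, adopt_time):
    """Count adopters with at least one neighbor who adopted strictly earlier."""
    count = 0
    for user, t in adopt_time.items():
        friends = neighbors.get(user, ())
        if any(adopt_time.get(v, float("inf")) < t for v in friends):
            count += 1
    return count


def shuffle_test(neighbors, adopt_time, n_shuffles=1000, seed=0):
    """Return (observed statistic, one-sided permutation p-value).

    The network is kept fixed across shuffles, which mirrors the assumption
    that connection likelihood does not change over time.
    """
    rng = random.Random(seed)
    observed = exposed_adoptions(neighbors, adopt_time)
    users = list(adopt_time)
    times = [adopt_time[u] for u in users]
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(times)  # permute adoption times among the adopters
        shuffled = dict(zip(users, times))
        if exposed_adoptions(neighbors, shuffled) >= observed:
            at_least += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (at_least + 1) / (n_shuffles + 1)
```

A small p-value would suggest that the timing of adoptions follows the network more closely than chance ordering would, which is the pattern expected under social influence.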
References
Adamic, L.A., Adar, E.: Friends and neighbors on the web. Soc. Netw. 25(3), 211–230 (2003)
Al Zamal, F., Liu, W., Ruths, D.: Homophily and latent attribute inference: inferring latent attributes of Twitter users from neighbors. In: Proceedings of the International Conference on Web and Social Media (ICWSM), pp. 387–390 (2012)
Alim, H.S.: Hip hop nation language. In: Duranti, A. (ed.) Linguistic Anthropology: A Reader, pp. 272–289. Wiley-Blackwell, Malden (2009)
Anagnostopoulos, A., Kumar, R., Mahdian, M.: Influence and correlation in social networks. In: Proceedings of Knowledge Discovery and Data Mining (KDD), pp. 7–15 (2008)
Androutsopoulos, J.: Language change and digital media: a review of conceptions and evidence. In: Coupland, N., Kristiansen, T. (eds.) Standard Languages and Language Standards in a Changing Europe. Novus, Oslo (2011)
Anis, J.: Neography: unconventional spelling in French SMS text messages. In: Danet, B., Herring, S.C. (eds.) The Multilingual Internet: Language, Culture, and Communication Online, pp. 87–115. Oxford University Press, Oxford (2007)
Backstrom, L., Sun, E., Marlow, C.: Find me if you can: improving geographical prediction with social and spatial proximity. In: Proceedings of the Conference on World-Wide Web (WWW), pp. 61–70 (2010)
Bakshy, E., Rosenn, I., Marlow, C., Adamic, L.: The role of social networks in information diffusion. In: Proceedings of the Conference on World-Wide Web (WWW), Lyon, France, pp. 519–528 (2012)
Baldwin, T., Cook, P., Lui, M., MacKinlay, A., Wang, L.: How noisy social media text, how diffrnt social media sources. In: Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP 2013), pp. 356–364 (2013)
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Stat. Soc. Ser. B (Methodol.) 57(1), 289–300 (1995)
Bucholtz, M., Hall, K.: Identity and interaction: a sociocultural linguistic approach. Discourse Stud. 7(4–5), 585–614 (2005)
Bucholtz, M., Bermudez, N., Fung, V., Edwards, L., Vargas, R.: Hella Nor Cal or totally So Cal? The perceptual dialectology of California. J. Engl. Linguist. 35(4), 325–352 (2007)
Centola, D., Macy, M.: Complex contagions and the weakness of long ties. Am. J. Sociol. 113(3), 702–734 (2007)
Crystal, D.: Language and the Internet, 2nd edn. Cambridge University Press, Cambridge (2006)
Dunbar, R.I.: Neocortex size as a constraint on group size in primates. J. Hum. Evol. 22(6), 469–493 (1992)
Eckert, P.: Linguistic Variation as Social Practice. Blackwell, Oxford (2000)
Eisenstein, J.: What to do about bad language on the internet. In: Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), pp. 359–369 (2013)
Eisenstein, J.: Systematic patterning in phonologically-motivated orthographic variation. J. Sociolinguistics 19, 161–188 (2015)
Eisenstein, J.: Written dialect variation in online social media. In: Boberg, C., Nerbonne, J., Watt, D. (eds.) Handbook of Dialectology. Wiley, Hoboken (2016)
Eisenstein, J., Ahmed, A., Xing, E.P.: Sparse additive generative models of text. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 1041–1048 (2011)
Fagyal, Z., Swarup, S., Escobar, A.M., Gasser, L., Lakkaraju, K.: Centers and peripheries: network roles in language change. Lingua 120(8), 2061–2079 (2010)
Granovetter, M.S.: The strength of weak ties. Am. J. Sociol. 78(6), 1360–1380 (1973)
Green, L.J.: African American English: A Linguistic Introduction. Cambridge University Press, Cambridge (2002)
Griffiths, T.L., Kalish, M.L.: Language evolution by iterated learning with Bayesian agents. Cogn. Sci. 31(3), 441–480 (2007)
Hamilton, W.L., Leskovec, J., Jurafsky, D.: Diachronic word embeddings reveal statistical laws of semantic change. In: Proceedings of the Association for Computational Linguistics (ACL), Berlin (2016)
Hawkes, A.G.: Spectra of some self-exciting and mutually exciting point processes. Biometrika 58(1), 83–90 (1971)
Herring, S.C.: Grammar and electronic communication. In: Chapelle, C.A. (ed.) The Encyclopedia of Applied Linguistics. Wiley, Hoboken (2012)
Huberman, B., Romero, D.M., Wu, F.: Social networks that matter: Twitter under the microscope. First Monday 14(1) (2008)
Johnstone, B., Bhasin, N., Wittkofski, D.: “Dahntahn” Pittsburgh: monophthongal /aw/ and representations of localness in Southwestern Pennsylvania. Am. Speech 77(2), 148–176 (2002)
Kulkarni, V., Al-Rfou, R., Perozzi, B., Skiena, S.: Statistically significant detection of linguistic change. In: Proceedings of the Conference on World-Wide Web (WWW), pp. 625–635 (2015)
Kwak, H., Lee, C., Park, H., Moon, S.: What is Twitter, a social network or a news media? In: Proceedings of the Conference on World-Wide Web (WWW), pp. 591–600 (2010)
La Fond, T., Neville, J.: Randomization tests for distinguishing social influence and homophily effects. In: Proceedings of the Conference on World-Wide Web (WWW), pp. 601–610 (2010)
Labov, W.: The social motivation of a sound change. Word 19(3), 273–309 (1963)
Labov, W.: Principles of Linguistic Change, vol. 2: Social Factors. Wiley-Blackwell, Hoboken (2001)
Labov, W.: Review of linguistic variation as social practice, by Penelope Eckert. Lang. Soc. 31, 277–284 (2002)
Labov, W.: Principles of Linguistic Change, vol. 3: Cognitive and Cultural Factors. Wiley-Blackwell, Hoboken (2011)
Latour, B., Woolgar, S.: Laboratory Life: The Construction of Scientific Facts. Princeton University Press, Princeton (2013)
Li, L., Deng, H., Dong, A., Chang, Y., Zha, H.: Identifying and labeling search tasks via query-based Hawkes processes. In: Proceedings of Knowledge Discovery and Data Mining (KDD), pp. 731–740 (2014)
Li, L., Zha, H.: Learning parametric models for social infectivity in multi-dimensional Hawkes processes. In: Proceedings of the National Conference on Artificial Intelligence (AAAI) (2015)
Milroy, L., Milroy, J.: Social network and social class: toward an integrated sociolinguistic model. Lang. Soc. 21(01), 1–26 (1992)
Niyogi, P., Berwick, R.C.: A dynamical systems model for language change. Complex Syst. 11(3), 161–204 (1997)
Ogata, Y.: On Lewis’ simulation method for point processes. IEEE Trans. Inf. Theor. 27(1), 23–31 (1981)
Pavalanathan, U., Eisenstein, J.: Audience-modulated variation in online social media. Am. Speech 90(2), 187–213 (2015)
Pavalanathan, U., Eisenstein, J.: Confounds and consequences in geotagged Twitter data. In: Proceedings of Empirical Methods for Natural Language Processing (EMNLP), September 2015
Rickford, J.R.: Geographical diversity, residential segregation, and the vitality of African American vernacular English and its speakers. Transform. Anthropol. 18(1), 28–34 (2010)
Sadilek, A., Kautz, H., Bigham, J.P.: Finding your friends and following them to where you are. In: Proceedings of the Conference on Web Search and Data Mining (WSDM), pp. 723–732 (2012)
Squires, L.: Enregistering internet language. Lang. Soc. 39, 457–492 (2010)
Tagliamonte, S.A., Denis, D.: Linguistic ruin? LOL! Instant messaging and teen language. Am. Speech 83(1), 3–34 (2008)
Trudgill, P.: Sex, covert prestige and linguistic change in the urban British English of Norwich. Lang. Soc. 1(2), 179–195 (1972)
Wolfram, W.: The linguistic variable: fact and fantasy. Am. Speech 66(1), 22–32 (1991)
Zhao, Q., Erdogdu, M.A., He, H.Y., Rajaraman, A., Leskovec, J.: Seismic: a self-exciting point process model for predicting tweet popularity. In: Proceedings of Knowledge Discovery and Data Mining (KDD), pp. 1513–1522 (2015)
Acknowledgments
Thanks to the reviewers for their feedback, to Márton Karsai for suggesting the infection risk analysis, and to Le Song for discussing Hawkes processes. John Paparrizos is an Alexander S. Onassis Foundation Scholar. This research was supported by the National Science Foundation under awards IIS-1111142 and RI-1452443, by the National Institutes of Health under award number R01-GM112697-01, and by the Air Force Office of Scientific Research.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Goel, R. et al. (2016). The Social Dynamics of Language Change in Online Networks. In: Spiro, E., Ahn, YY. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol 10046. Springer, Cham. https://doi.org/10.1007/978-3-319-47880-7_3
DOI: https://doi.org/10.1007/978-3-319-47880-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47879-1
Online ISBN: 978-3-319-47880-7
eBook Packages: Computer Science, Computer Science (R0)