Testing the Robustness of Laws of Polysemy and Brevity Versus Frequency

  • Conference paper
Statistical Language and Speech Processing (SLSP 2016)

Abstract

The pioneering research of G.K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws. We focus on two of them: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be shorter. We evaluate the robustness of these laws in contexts where, to our knowledge, they have not yet been explored. Recovering the laws under these new conditions supports the hypothesis that they originate from abstract mechanisms.
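As an illustration of what testing the law of abbreviation can look like, the following minimal sketch computes a rank correlation between word frequency and word length on a toy sentence. It is not the authors' procedure: the sample text, the function name kendall_tau_b, and the choice of Kendall's tau-b as the statistic are all assumptions made for this example. A negative value would be consistent with more frequent words tending to be shorter. The same function could be applied to the meaning-frequency law by replacing lengths with a number of senses per word taken from some dictionary.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b rank correlation, computed over all pairs (O(n^2))."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Toy text, chosen only for illustration (an assumption, not data from the paper).
text = "the cat sat on the mat and the dog chased the cat across the garden"
counts = Counter(text.split())

words = sorted(counts)
frequencies = [counts[w] for w in words]  # occurrences of each word type
lengths = [len(w) for w in words]         # length in characters

tau = kendall_tau_b(frequencies, lengths)
print(f"Kendall tau-b between frequency and length: {tau:.3f}")
```

On a real corpus one would normally also assess statistical significance and could measure length in other units; this sketch leaves both out.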

Notes

  1. http://multisemcor.fbk.eu/semcor.php

Acknowledgments

The authors thank Pedro Delicado and the reviewers for their helpful comments. This research has been supported by the SGR2014-890 (MACDA) project of the Generalitat de Catalunya and by the MINECO project APCOM (TIN2014-57226-P) of the Ministerio de Economía y Competitividad, Spanish Government.

Author information

Corresponding author

Correspondence to Antoni Hernández-Fernández.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Hernández-Fernández, A., Casas, B., Ferrer-i-Cancho, R., Baixeries, J. (2016). Testing the Robustness of Laws of Polysemy and Brevity Versus Frequency. In: Král, P., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2016. Lecture Notes in Computer Science, vol 9918. Springer, Cham. https://doi.org/10.1007/978-3-319-45925-7_2

  • DOI: https://doi.org/10.1007/978-3-319-45925-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45924-0

  • Online ISBN: 978-3-319-45925-7

  • eBook Packages: Computer Science, Computer Science (R0)
