The “New Statistics” on the Vocabulary Level

  • Gustav Herdan
Part of the Kommunikation und Kybernetik in Einzeldarstellungen book series (COMMUNICATION, volume 4)


Our analysis lets us see the universe of vocabulary in a new light. As we know, the parameters used for characterising a typical statistical universe, such as the Mean and the Standard Deviation, are of no use for the frequency distribution of vocabulary, because they change with sample size. However, for the universe of vocabulary it was found that the coefficient of variation of the mean, $v_m$, was sensibly independent of the sample size, and could therefore serve to characterise the vocabulary distribution. The corresponding parameter in the universe of vocabulary is the Repeat Rate, whose sampling value we must take to be $v_m^2$.
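The relation between the coefficient of variation of the mean and the Repeat Rate can be checked numerically. The abstract does not state the formulas, so the sketch below assumes the usual definitions: for V distinct vocabulary items with occurrence counts f_i and text length N = Σ f_i, the coefficient of variation is v = σ / mean (population standard deviation), $v_m = v / \sqrt{V}$, and the Repeat Rate is Σ (f_i / N)². Under these assumptions $v_m^2$ equals the Repeat Rate minus 1/V, so the two nearly coincide for a large vocabulary. The function name `vocabulary_stats` and the whitespace tokenisation are illustrative choices, not taken from the chapter.

```python
from collections import Counter
from math import sqrt


def vocabulary_stats(tokens):
    """Return v_m and the Repeat Rate for a list of word tokens.

    Assumed definitions (not given in the abstract):
      mean        = N / V                 (N tokens, V distinct items)
      variance    = population variance of the item counts
      v           = sqrt(variance) / mean
      v_m         = v / sqrt(V)
      repeat_rate = sum over items of (count / N) ** 2
    """
    counts = Counter(tokens)
    if not counts:
        raise ValueError("at least one token is required")

    n_tokens = sum(counts.values())
    n_types = len(counts)

    mean = n_tokens / n_types
    variance = sum((f - mean) ** 2 for f in counts.values()) / n_types

    v = sqrt(variance) / mean          # coefficient of variation
    v_m = v / sqrt(n_types)            # coefficient of variation of the mean
    repeat_rate = sum((f / n_tokens) ** 2 for f in counts.values())

    return {
        "tokens": n_tokens,
        "types": n_types,
        "v_m": v_m,
        "repeat_rate": repeat_rate,
    }


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat".split()
    stats = vocabulary_stats(sample)
    # Under the assumed definitions, v_m squared differs from the
    # Repeat Rate by exactly 1 / (number of distinct items).
    print(stats["v_m"] ** 2, stats["repeat_rate"] - 1 / stats["types"])
```

On a toy sample the 1/V term is not negligible; it shrinks as the vocabulary grows, which is the regime the abstract has in view.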


Keywords (machine-generated, not supplied by the author): Repeat Rate; Black Body Radiation; Vocabulary Item; Alternative Probability; Text Length





Copyright information

© Springer-Verlag Berlin · Heidelberg 1966

Authors and Affiliations

  • Gustav Herdan, University of Bristol, UK
