Asymptotically Normal Estimators for Zipf’s Law
We study an infinite urn scheme whose urn probabilities follow a power function. The urns represent words from an infinitely large vocabulary. We propose asymptotically normal estimators of the exponent of the power function. The estimators use the number of different elements and a few similar statistics. If only one of these statistics is used, the asymptotics of a normalizing constant (a function of the parameter) must be known, and all the estimators are implicit in this case. If two statistics are used, the estimators are explicit, but their rates of convergence are lower than those of the estimators with a known normalizing constant.
Keywords and phrases. Infinite urn scheme, Zipf’s law, asymptotic normality.
AMS (2000) subject classification. Primary 62F10; Secondary 62F12.
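The idea of an explicit estimator built from two statistics can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes urn probabilities proportional to i^(-1/theta) with 0 < theta < 1, truncates the infinite vocabulary to a large finite one, and uses the classical ratio of singleton words to distinct words, which converges to theta under Karlin's regular variation framework. The function names `simulate_counts` and `ratio_estimate` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_counts(theta, n_balls, vocab_size, seed=0):
    """Throw n_balls into urns with p_i proportional to i**(-1/theta).

    Assumption: the infinite urn scheme is approximated by a finite
    vocabulary of vocab_size urns; the discarded tail mass is negligible
    for the values used below.
    """
    rng = np.random.default_rng(seed)
    ranks = np.arange(1, vocab_size + 1, dtype=np.float64)
    probs = ranks ** (-1.0 / theta)
    probs /= probs.sum()  # normalizing constant, computed numerically
    draws = rng.choice(vocab_size, size=n_balls, p=probs)
    # counts[j] = number of balls in the j-th non-empty urn
    _, counts = np.unique(draws, return_counts=True)
    return counts


def ratio_estimate(counts):
    """Explicit two-statistic estimate: singletons / distinct elements."""
    distinct = counts.size
    singletons = int(np.count_nonzero(counts == 1))
    return singletons / distinct


counts = simulate_counts(theta=0.5, n_balls=200_000, vocab_size=1_000_000)
theta_hat = ratio_estimate(counts)
print(f"distinct words: {counts.size}, estimate of theta: {theta_hat:.3f}")
```

The ratio needs no knowledge of the normalizing constant, which matches the trade-off described in the abstract: the estimate is explicit, but it rests on counts that grow only like a fractional power of the sample size, so its fluctuations shrink slowly.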
Our research was partially supported by RFBR grant 17-01-00683 and by the program of fundamental scientific researches of the SB RAS No. I.1.3., project No. 0314-2016-0008.