# Theory of Estimation

• Leopold Schmetterer
Part of the Die Grundlehren der mathematischen Wissenschaften book series (GL, volume 202)

## Abstract

We first give a rough sketch of the problem treated in this chapter. In the previous chapter we dealt with the question of how one can acquire more precise information on the value of an unknown parameter on the basis of a sample. Although one tries to construct confidence sets which are “as small as possible”, one cannot be guided in such a construction by the idea of “exactly” determining the parameter. To work out this concept is the goal of the theory of estimation. If $$(R,S)$$ is a sample space and Γ a set of parameters of a class of probability measures $$P_\Gamma$$ over $$(R,S)$$, then one seeks a map h of R into Γ such that h(x) for a sample x∈R is “approximately” equal to the true parameter value. We are primarily concerned with the case in which Γ is a subset of $$R^1$$, or where we have to estimate a mapping d from Γ into $$R^1$$. For the sake of a simplified formulation we agree that Γ will always be a non-empty set of parameters of a class of probability measures and d a map from Γ into $$R^1$$, unless something else is specifically said. Further conditions may also be imposed on Γ as well as on d.
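As an illustration of such an estimation map h (this example is not part of the original text; the choice of a normal model and of the sample mean as estimator is ours), one may take Γ to be the set of possible means of a normal distribution with known variance and let h send a sample to its arithmetic mean:

```python
import random
import statistics

def h(sample):
    """An estimation map h: R -> Gamma.

    Here the sample space R consists of n-tuples of reals drawn from a
    normal law with unknown mean gamma, and h(x) is the sample mean,
    an unbiased estimate of gamma.
    """
    return statistics.mean(sample)

random.seed(0)
true_gamma = 2.5  # the unknown parameter, fixed here for the demonstration
sample = [random.gauss(true_gamma, 1.0) for _ in range(10_000)]

estimate = h(sample)
print(estimate)  # "approximately" equal to true_gamma
```

For large n the estimate concentrates around the true parameter value, which is the sense in which h(x) is “approximately” equal to γ.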

## Keywords

Probability Measure, Unbiased Estimate, Sample Space, Asymptotic Variance, Conditional Density

## References

1. This concept was given its first clear treatment in F. N. David and J. Neyman, Statist. Res. Mem. Univ. London 2, 105–116 (1938).
2. See also Theorem 1 in H. Teicher, Ann. Math. Statist. 34, 1265–1269 (1963).
3. Due to A. N. Kolmogorov, Izv. Akad. Nauk SSSR Ser. Mat. 14, 303–326 (1950).
4. This result is essentially due to C. R. Rao, Sankhya 12, 27–42 (1952).
5. See L. Schmetterer, Ann. Math. Statist. 31, 1154–1163 (1960) and Publ. Math. Inst. Hungar. Acad. Sci. Ser. A 6, 295–300 (1961). See also A. Krámli, Studia Sci. Math. Hung. 2, 159–161 (1967).
6.
7. See L. Schmetterer, Mitteilungsbl. Math. Statist. 9, 147–152 (1957).
8. See E. W. Barankin, Ann. Math. Statist. 20, 477–501 (1949) and L. Schmetterer, loc. cit. 5.
9. R. R. Bahadur, Sankhya 18, 211–224 (1957).
10.
11. D. Blackwell, Ann. Math. Statist. 18, 105–110 (1947). See also A. N. Kolmogorov, loc. cit. 3.
12. For this terminology see III, p. 206.
13. Here and in the following lines we have occasionally suppressed the reference to γ.
14. E. L. Lehmann and H. Scheffé, Sankhya 10, 305–340 (1950).
15. More precisely, we are speaking here of absolute moments. Thus, we consider $$E(|h - d(\gamma ){|^p};\gamma ).$$
16.
17. See L. Schmetterer, Abh. Deutsch. Akad. Wiss. Berlin, Kl. Math. Phys., Tech. 1964, Nr. 4, 117–120 and J. Roy and I. M. Chakravarti, Ann. Math. Statist. 31, 392–398 (1960).
18. From the sampling theory standpoint the case in which all components of γ appear in $$x_\gamma^{(i)}$$ is trivial.
19. The definition of the cj, 1⩽j⩽N, on the elements of R that are different from $$x_\gamma^{(i)}$$ is of no consequence.
20. We naturally assume that this set is non-empty.
21. Theorem 1.1 referred, however, to the totality of all unbiased estimates and not just to the linear estimates. The theorem holds trivially for linear estimates h if one limits the class of estimates of zero to the linear ones.
22. See M. Fréchet, Rev. Inst. Internat. Statist. 11, 182–205 (1943), C. R. Rao, Bull. Calcutta Math. Soc. 37, 81–91 (1945) and H. Cramér, Skand. Aktuarietidskr. 29, 85–94 (1946).
23. One can replace this assumption by the requirement that f(x, γ)=0 for each γ on a set of x-values independent of γ.
24. This result is of course related to I, Theorem 17.1.
25.
26. R. R. Bahadur, loc. cit. 9 and L. Schmetterer, loc. cit. 5.
27. If Γ ⊆ R1 and d(γ) = γ for all γ∈Γ, then we also say that hn is consistent for γ∈Γ.
28.
29. We do not exclude the possibility that N(ε) depends on γ.
30. L. Le Cam and L. Schwartz, Ann. Math. Statist. 31, 140–150 (1960). See also J. L. Doob, Colloques internationaux du Centre National de la Recherche Scientifique, no. 13, 23–27, Centre National de la Recherche Scientifique, Paris 1949.
31. R. A. Fisher, Messenger of Math. 41, 150–160 (1912).
32. For the following one can replace the assumption f(x, γ)≠0 for all x∈R1 by f(x, γ)≠0 for all x∈R1 up to a μ-null set without difficulty. Moreover, R1 can also be replaced by an arbitrary Borel set M not depending on γ, i.e., it is sufficient to require f(x, γ)≠0 for each γ∈Γ and all x∈M and f(x, γ)=0 for each γ∈Γ and all x∈R1−M.
33. One can again allow exceptional sets of μ-measure zero here.
34. N(δ, ε) may also depend on γ but we suppress this. The same applies to analogous statements in the following proof.
35. “0” is here the k-dimensional zero vector.
36. This proof is somewhat related to that of H. Cramér, loc. cit. I, 58, 500ff. For the case considered here of a multi-dimensional parameter, K. C. Chanda, Biometrika 41, 56–61 (1954) has carried out Cramér’s proof in detail.
37. We have modified here and in the sequel an argument of H. Hornich, Monatsh. Math. 54, 130–134 (1950).
38. See V. S. Huzurbazar, Ann. Eugenics 14, 185–200 (1948).
39. A more precise formulation can be given with the help of the statement of Theorem 3.8.
40. See L. Le Cam and Ch. Kraft, Ann. Math. Statist. 27, 1174–1177 (1956) and also R. R. Bahadur, Sankhya 20, 207–210 (1958).
41. See p. 192. One can for example choose the set of all k-tuples in C with rational components. Note that only the existence of a dense set in C is used in the proof and not the compactness of C.
42. Cf. H. Richter, Math. Ann. 150, 85–90 and 440–441 (1963), M. Sion, Trans. Amer. Math. Soc. 96, 237–246 (1960).
43. A. Wald, Ann. Math. Statist. 20, 595–601 (1949), J. Wolfowitz, Ann. Math. Statist. 20, 601–602 (1949). See also J. L. Doob, Trans. Amer. Math. Soc. 36, 759–775 (1934).
44. Γ can in principle be any open set.
45. It is enough to require that the mapping be continuous for all x up to a μ-null set.
46. For each γ∈Γ there is a sufficiently small ρ such that Kρ(γ)⊂Γ.
47. We do not exclude the possibility that n(ε, δ) depends on γ.
48. One sees immediately that these assumptions can be generalized: Γ can be an arbitrary open set and Γ* an open, bounded subset of Γ with closure belonging to Γ. The closure is the smallest closed set containing Γ*, i.e., the intersection of all closed subsets of Rk containing Γ*.
49. See, for instance, J. Pfanzagl, Metrika 14, 249–272 (1969).
50. A careful analysis of the assumptions used to prove asymptotic normality of ML estimates is given by L. Le Cam, Ann. Math. Statist. 41, 802–828 (1970). See also P. J. Huber, Proc. 5th Berkeley Sympos. Math. Stat. Probab. Vol. I, 221–233 (1967).
51. All these statements are related to probabilities w.r.t. γ.
52. 0 represents here the k-dimensional null vector.
53. Actually, we have suppressed a step here since $$(I + C_n(n))^{-1}$$ is only defined with probability arbitrarily close to 1 for sufficiently large n.
54. See footnote 48 for remarks on this formulation.
55. This follows from the additional continuity assumption and (3.3). Thus, γ → |E(A(ξγ); γ)| possesses a positive lower bound in Γ0.
56. L. Le Cam, Univ. California Publ. Statist. 1, 277–329 (1953) and loc. cit. 50.
57. For the notation see Theorem 3.6.
58. Cf. L. Le Cam, loc. cit. 56. See also P. J. Bickel and J. A. Yahav, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 11, 257–276 (1969) and the references given there.
59. See I, p. 59. The map γ → f(x, γ) is thus defined up to a null set (which does not depend on x).
60. Possibly with the exception of L-null sets.
61. We use the notation of Theorem 3.1 although the context is somewhat different.
62. Kρ(γ) has the same meaning as in Theorem 3.6.
63. We allow the possibility that the expectation on the left is +∞.
64. For the meaning of $$P_{\gamma_0,\infty}$$ see its definition on p. 314.
65.
66. L. Le Cam, loc. cit. 56.
67. For the notation see p. 296. Also cf. 61.
68. B. L. van der Waerden, Ber. Verh. Sächs. Akad. Leipzig, Math.-Phys. Kl. 87, 353–364 (1935).
69.
70. The theory of Bayes estimates has been carefully studied recently. We only mention: M. H. de Groot and M. M. Rao, Ann. Math. Statist. 34, 598–611 (1963) and L. Schwartz, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4, 10–26 (1965).
71. L. Le Cam, loc. cit. 56.
72.
73. See, for example, R. A. Fisher, Philos. Trans. Roy. Soc. London Ser. A 222, 309–368 (1922) and R. A. Fisher, Proc. Cambridge Philos. Soc. 22, 700–725 (1925).
74.
75. See in connection with the entire question C. R. Rao, J. Roy. Statist. Soc. Ser. B 24, 46–72 (1962), Proc. Fourth Berkeley Sympos. Math. Statist. and Prob. Vol. I, pp. 531–545, Univ. California Press, Berkeley, Calif. (1960), Sankhya 24, Ser. A, 73–101 (1962), as well as Sankhya 25, Ser. A, 189–206 (1963).
76. J. Neyman, Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, pp. 239–273 (1949), University of California Press, Berkeley and Los Angeles.
77. R. R. Bahadur, Sankhya 22, 229–253 (1960).
78. See D. Basu, Sankhya 17, 193–196 (1956) as well as E. L. Lehmann, Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability 1949, pp. 451–457, University of California Press, Berkeley and Los Angeles, where, in a somewhat different connection, a similar example is considered.
79. See L. Schmetterer, Research Papers in Statistics (Neyman Festschrift, 301–317), John Wiley, New York 1966.