Abstract
This chapter treats the problem of point estimation for one-parameter models. It considers the formalisation of the point estimation problem, including notions that qualify and quantify the accuracy of estimators (mean squared error) and fundamental limitations on this accuracy (the Cramér–Rao theorem). It then focusses on maximum likelihood as a central method of estimation, including its asymptotic properties. The chapter concludes with a discussion of the Newton–Raphson iteration and the method of moments.
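The chapter's closing topic lends itself to a short computational illustration. The following is a minimal sketch of the Newton–Raphson iteration for a one-parameter maximum likelihood problem; the Cauchy location model and the function name newton_raphson_mle are our own illustrative choices, not taken from the chapter.

```python
# Sketch of the Newton-Raphson iteration for a one-parameter MLE.
# Illustrative model (our choice): Cauchy location,
# f(x; theta) = 1 / (pi * (1 + (x - theta)^2)),
# whose likelihood equation has no closed-form solution.
import numpy as np

def newton_raphson_mle(x, theta0, tol=1e-8, max_iter=100):
    """Iterate theta <- theta - l'(theta)/l''(theta) until the step is tiny."""
    theta = theta0
    for _ in range(max_iter):
        u = x - theta
        score = np.sum(2.0 * u / (1.0 + u**2))                     # l'(theta)
        curvature = np.sum(2.0 * (u**2 - 1.0) / (1.0 + u**2)**2)   # l''(theta)
        step = score / curvature
        theta = theta - step
        if abs(step) < tol:
            break
    return theta

# Usage: the sample median is a sensible starting point for this model.
rng = np.random.default_rng(1)
x = rng.standard_cauchy(500) + 2.0           # true location parameter: 2.0
print(newton_raphson_mle(x, np.median(x)))   # should be close to 2.0
```

Newton–Raphson solves the likelihood equation \(\ell '(\theta ) = 0\) by repeatedly updating \(\theta \leftarrow \theta -\ell '(\theta )/\ell ''(\theta )\); the Cauchy model is a standard case where the maximiser has no closed form, so iteration is genuinely needed.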
Notes
1. More generally, we may define
$$\displaystyle{I_{n}(\theta ) = \mathbb{E}\left [\left (\frac{\partial } {\partial \theta }\log f_{X_{1},\ldots ,X_{n}}(X_{1},\ldots ,X_{n};\theta )\right )^{2}\right ]}$$
to be the Fisher information of a sample of size n. In the case of iid random variables, we have \(I_{n}(\theta ) = nI(\theta )\); see the worked example following these notes.
2. A consequence of the Cauchy–Schwarz inequality.
3. In the continuous case, a similar interpretation is feasible by considering a small neighbourhood around our sample: since \(F(x+\epsilon /2;\theta ) - F(x-\epsilon /2;\theta ) \approx \epsilon f(x;\theta )\) as \(\epsilon \downarrow 0\), we can think of \(\epsilon ^{n}L(\theta )\) as the approximate probability of a cubic neighbourhood of edge length \(\epsilon\) centred at our sample, viewed as a function of \(\theta\).
4. Recall the inverse function theorem: let \(h: \mathbb{R} \rightarrow \mathbb{R}\) be continuously differentiable, with a non-zero derivative at a point \(x_{0} \in \mathbb{R}\). Then there exists an \(\varepsilon > 0\) such that \(h^{-1}\) exists and is continuously differentiable on \((h(x_{0})-\varepsilon,h(x_{0})+\varepsilon )\), and in fact \((h^{-1})'(y) = [h'(h^{-1}(y))]^{-1}\) for \(\vert y - h(x_{0})\vert <\varepsilon\). (See the short example following these notes.)
5. Remember: \(\overline{T}\) is the mean of the iid terms \(T(X_{1}),\ldots ,T(X_{n})\), each satisfying \(\mathrm{Var}(T(X_{i})) =\gamma ''(\phi _{0}) <\infty\) and \(\mathbb{E}[T(X_{i})] =\gamma '(\phi _{0})\); the central limit theorem therefore implies \(\sqrt{n}(\overline{T} -\gamma '(\phi _{0}))\stackrel{d}{\rightarrow }N(0,\gamma ''(\phi _{0}))\).
6. To see this, use Slutsky’s theorem with \(X_{n} = \sqrt{n}(\tilde{\phi }_{n} -\phi _{0})\), \(Y_{n} = \sqrt{n}(\hat{\phi }_{n} -\tilde{\phi }_{n})\), and the continuous mapping \((X_{n},Y_{n})\mapsto X_{n} + Y_{n}\).
7. Since \(\overline{T}\) is the mean of the iid terms \(T(X_{1}),\ldots ,T(X_{n})\), each satisfying \(\mathrm{Var}(T(X_{i})) =\gamma ''(\phi _{0}) <\infty\) and \(\mathbb{E}[T(X_{i})] =\gamma '(\phi _{0})\), the law of large numbers implies that \(\overline{T}\stackrel{p}{\longrightarrow }\gamma '(\phi _{0})\).
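A worked illustration of Note 1 (our own example, not taken from the chapter): for iid \(X_{1},\ldots ,X_{n} \sim \mathrm{Bernoulli}(\theta )\), the log-density of a single observation is \(\log f(x;\theta ) = x\log \theta + (1 - x)\log (1-\theta )\), so that
$$\displaystyle{\frac{\partial } {\partial \theta }\log f(X;\theta ) = \frac{X-\theta } {\theta (1-\theta )},\qquad I(\theta ) = \mathbb{E}\left [\left (\frac{X-\theta } {\theta (1-\theta )}\right )^{2}\right ] = \frac{\mathrm{Var}(X)} {\theta ^{2}(1-\theta )^{2}} = \frac{1} {\theta (1-\theta )},}$$
and hence \(I_{n}(\theta ) = n/(\theta (1-\theta ))\), in agreement with \(I_{n}(\theta ) = nI(\theta )\).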
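To make Note 4 concrete (again our own example): take \(h(x) = e^{x}\), so that \(h'(x) = e^{x} > 0\) at every point. Then \(h^{-1}(y) =\log y\), and the theorem’s formula gives \((h^{-1})'(y) = [h'(h^{-1}(y))]^{-1} = [e^{\log y}]^{-1} = 1/y\), matching the derivative of \(\log y\) computed directly.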