Simple Estimation for Categorical Data

Rudas, Tamás

doi:10.1007/978-1-4939-7693-5_4

Part of the book series: Springer Texts in Statistics (STS)

Abstract

This chapter summarizes several simple procedures often used in the analysis of categorical data. These include maximum likelihood estimation of parameters of binomial, multinomial, and Poisson distributions and also unbiased estimation with unequal selection probabilities. The Lagrange multiplier method is introduced, and maximum likelihood estimation in general parametric models is considered. In addition to the usual formula for the standard error of an estimated probability, the δ-method is used to derive asymptotic standard errors for estimates of more complex quantities, which are routinely reported in surveys. Standard errors of estimates of fractions based on stratified samples are compared to standard errors obtained from simple random samples.
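
As a rough illustration of the quantities named in the abstract, the following Python sketch computes relative frequencies (the maximum likelihood estimates of multinomial probabilities), the usual standard error of one estimated probability, and a δ-method standard error for the difference of two estimated probabilities. Multinomial sampling is assumed; the function names and the counts are hypothetical and are not taken from the chapter.

```python
import math


def mle_probabilities(counts):
    """Relative frequencies: the ML estimates of multinomial cell probabilities."""
    n = sum(counts)
    return [c / n for c in counts]


def standard_error(p_hat, n):
    """Usual estimated standard error of one estimated probability."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def se_difference(p_i, p_j, n):
    """Delta-method standard error of p_i - p_j under multinomial sampling.

    Uses Var(p_i_hat) = p_i (1 - p_i) / n and Cov(p_i_hat, p_j_hat) = -p_i p_j / n,
    which gives Var(p_i_hat - p_j_hat) = (p_i + p_j - (p_i - p_j)**2) / n.
    """
    return math.sqrt((p_i + p_j - (p_i - p_j) ** 2) / n)


# Hypothetical survey counts for three response categories.
counts = [450, 350, 200]
n = sum(counts)
p_hat = mle_probabilities(counts)
se_first = standard_error(p_hat[0], n)
se_lead = se_difference(p_hat[0], p_hat[1], n)

print("estimated probabilities:", p_hat)
print("standard error of the first estimate:", round(se_first, 4))
print("standard error of the difference of the first two:", round(se_lead, 4))
```

Because a difference is a linear function of the estimated probabilities, the δ-method here reproduces the exact variance formula; for nonlinear quantities such as ratios it gives only an asymptotic approximation.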

Notes

1. In a categorical setup, the likelihood of the sample is the same as its probability. The word likelihood is also used to refer to situations where all observations have zero probability (as with continuous random variables) but different likelihoods (as, e.g., with a normal distribution).
2. In a more general setting, the density.
3. For the time being, it is assumed that this is possible, that is, p > 0. Section 4.1.1 gives a more detailed discussion that includes the case of p = 0 and that also applies here.
4. Later on, maximum likelihood estimates under statistical models defined in terms of restrictions on p will be determined. In those cases, the additional restrictions implied by the model need to be imposed, too.
5. Avoiding self-selection of the respondents and reducing the effects of nonresponse and other kinds of missing data are major problems of survey methodology. Also, many published statistical analyses of survey data disregard the peculiarities of the sampling procedure and work as if the sampling distribution were multinomial or Poisson. In reality, most nationwide surveys use complex sampling procedures that often include stratification and multistage selection. The goal of applying these procedures may be to reduce the data collection cost per respondent or to incorporate information about the population to reduce the standard deviations of estimates. There are many good books on survey sampling; (41) and (54) are recommended in particular.
6. Note that some authors distinguish between standard deviation, which is a parameter associated with a random variable, and standard error, which is the same parameter associated with a quantity determined from a sample. Such a strict distinction is not made in this book.
7. Care should be taken not to interpret the margin of error as if it were an absolute bound on the magnitude of the possible error of the estimate.
8. More precisely, the margin of error addresses only the size of the so-called sampling error, that is, the difference between the estimate from the survey and the value that would be obtained if the methods of the survey were used to carry out a census. In a census, data are collected from the entire population; no sampling occurs. In most cases, however, even the value obtained from the census may be different from the true value. For example, respondents may not remember or may not want to tell the truth, or any other kind of measurement error may occur. The difference between the census value and the true value is called the nonsampling error of the survey.
9. In many countries, a party has to receive at least 5% of the votes that were actually cast to get into the parliament. For this and other reasons, the number of seats in the parliament may not be a linear function of the fraction of votes received.
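
The remark that the margin of error is not an absolute bound can be illustrated with a small simulation. The sketch below assumes simple random sampling with a known true proportion, a 95% margin of error computed as 1.96 times the estimated standard error, and hypothetical values for the proportion, sample size, and number of replications; none of these come from the chapter. It covers sampling error only and says nothing about nonsampling error.

```python
import math
import random

Z_95 = 1.96  # approximate 0.975 quantile of the standard normal distribution


def margin_of_error(p_hat, n, z=Z_95):
    """Margin of error: z times the estimated standard error of p_hat."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def share_within_margin(p_true, n, replications, seed=0):
    """Fraction of simulated samples whose estimate lies within its own
    margin of error of the true proportion."""
    rng = random.Random(seed)
    within = 0
    for _ in range(replications):
        # Draw one sample of n independent yes/no responses.
        successes = sum(1 for _ in range(n) if rng.random() < p_true)
        p_hat = successes / n
        if abs(p_hat - p_true) <= margin_of_error(p_hat, n):
            within += 1
    return within / replications


coverage = share_within_margin(p_true=0.3, n=400, replications=1000)
print(f"share of samples within the margin of error: {coverage:.3f}")
```

In runs of this kind the share is typically close to, but below, one: a small fraction of samples produces estimates that miss the true value by more than the reported margin.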

References

Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)
Hansen, M.H., Hurwitz, W.N., Madow, W.G.: Sample Survey Methods and Theory, Volumes I and II. Wiley, New York (1993)
Kish, L.: Survey Sampling. Wiley, New York (1995)
Lohr, S.L.: Sampling: Design and Analysis. Brooks/Cole, Boston (2009)

Author information

Authors and Affiliations

Center for Social Sciences, Hungarian Academy of Sciences, Budapest, Hungary
Tamás Rudas
Eötvös Loránd University, Budapest, Hungary
Tamás Rudas

About this chapter

Cite this chapter

Rudas, T. (2018). Simple Estimation for Categorical Data. In: Lectures on Categorical Data Analysis. Springer Texts in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-7693-5_4

DOI: https://doi.org/10.1007/978-1-4939-7693-5_4
Published: 31 March 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-7691-1
Online ISBN: 978-1-4939-7693-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
