Introduction to Probability Theory and Statistics

  • Gang Zheng
  • Yaning Yang
  • Xiaofeng Zhu
  • Robert C. Elston
Part of the Statistics for Biology and Health book series (SBH)


Chapter 1 covers basic probability theory together with the statistical models and procedures used in the analysis of genetic studies. It starts with an introduction to basic distribution theory and the common distributions used in the book, including the uniform, multinomial, normal, t, F, beta, gamma, chi-squared and hypergeometric distributions. The basic distributions of order statistics are also given. Several types of stochastic convergence used in the book are summarized. Maximum likelihood estimation and its large-sample properties are discussed. Various tests, including the efficient score test, the likelihood ratio test and the Wald test, are studied with and without nuisance parameters. Multiple testing issues that arise when testing association with multiple genetic markers, and when testing hypotheses under an unknown genetic model, are briefly reviewed. The chapter also covers the Delta method, the EM algorithm, basic concepts of sample size and power calculations, and asymptotic relative efficiency.
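As an illustration of the three tests named in the abstract, the following minimal sketch computes the Wald, score and likelihood ratio statistics for a single binomial proportion with no nuisance parameter. It is not taken from the chapter: the function names `binomial_tests` and `chi2_1df_pvalue` and the example counts are hypothetical. It assumes 0 < x < n, so that the maximum likelihood estimate lies strictly inside (0, 1).

```python
import math


def binomial_tests(x, n, p0):
    """Wald, score and likelihood ratio statistics for H0: p = p0.

    x is the number of successes in n independent trials.
    Assumption (not from the chapter): 0 < x < n and 0 < p0 < 1,
    so no variance estimate is zero and no logarithm is undefined.
    """
    if not (0 < x < n) or not (0.0 < p0 < 1.0):
        raise ValueError("requires 0 < x < n and 0 < p0 < 1")

    p_hat = x / n  # maximum likelihood estimate of p

    # Wald test: variance evaluated at the MLE.
    wald = (p_hat - p0) ** 2 / (p_hat * (1.0 - p_hat) / n)

    # Score test: variance evaluated under the null hypothesis.
    score = (p_hat - p0) ** 2 / (p0 * (1.0 - p0) / n)

    # Likelihood ratio test: twice the log-likelihood difference.
    lrt = 2.0 * (
        x * math.log(p_hat / p0)
        + (n - x) * math.log((1.0 - p_hat) / (1.0 - p0))
    )
    return {"wald": wald, "score": score, "lrt": lrt}


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-squared variable with 1 df.

    Uses P(chi2_1 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    stats = binomial_tests(x=60, n=100, p0=0.5)
    for name, value in stats.items():
        print(name, round(value, 4), round(chi2_1df_pvalue(value), 4))
```

Under the null hypothesis, all three statistics have the same asymptotic chi-squared distribution with one degree of freedom. They differ in finite samples because the Wald statistic evaluates the variance at the estimate, while the score statistic evaluates it at the null value.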


Probability Density Function; Maximum Likelihood Estimate; Nuisance Parameter; Multivariate Normal Distribution; Joint Probability Density Function



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Gang Zheng (1)
  • Yaning Yang (2)
  • Xiaofeng Zhu (3)
  • Robert C. Elston (3)

  1. Bethesda, USA
  2. School of Management, Dept. Statistics & Finance, University of Science & Technology of China, Hefei, People’s Republic of China
  3. School of Medicine, Dept. Epidemiology & Biostatistics, Case Western Reserve University, Cleveland, USA
