Analytical Methods for Immunogenetic Population Data

Part of the book series: Methods in Molecular Biology (MIMB, volume 882)

Abstract

In this chapter, we describe analyses commonly applied to immunogenetic population data, along with the software tools currently available to perform them. Where possible, we focus on tools developed specifically for the analysis of highly polymorphic immunogenetic data. These analytical methods serve both to assess whether a dataset is appropriate for testing a specific hypothesis and to test that hypothesis. Rather than treat this chapter as a protocol for analyzing any population dataset, each researcher and analyst should first consider their data, the possible analyses, and any available tools in light of the hypothesis being tested. The extent to which the data and analyses are appropriate to each other should be determined before any analyses are performed.

Acknowledgments

This work was supported by National Institutes of Health (NIH) grants U01AI067068 (JAH, SJM) and U19 AI067152 (PAG) awarded by the National Institute of Allergy and Infectious Diseases (NIAID) and by NIH/NIAID contract AI40076 (RMS, GT). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases or the National Institutes of Health.

Author information

Corresponding author

Correspondence to Steven J. Mack.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Mack, S.J., Gourraud, PA., Single, R.M., Thomson, G., Hollenbach, J.A. (2012). Analytical Methods for Immunogenetic Population Data. In: Christiansen, F., Tait, B. (eds) Immunogenetics. Methods in Molecular Biology, vol 882. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-61779-842-9_13

  • DOI: https://doi.org/10.1007/978-1-61779-842-9_13

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-61779-841-2

  • Online ISBN: 978-1-61779-842-9

  • eBook Packages: Springer Protocols
