Abstract
This chapter provides an elementary introduction to some of the basic biology and technology underlying genetic association studies that rely on dense genotyping of nominally unrelated individuals to discover genetic variants related to risk of disease and other outcomes, phenotypes, or traits. It discusses relevant aspects of DNA and RNA architecture and the coding of amino acids, describes chromosomal organization, and gives an overview of the most common types of sequence variation and of genotyping methods. It also introduces concepts, databases, analysis programs, and example data that are used in later portions of the book.
Notes
1. These are DNA codons; RNA codons (transcribed from DNA for interaction with the ribosome during protein synthesis) contain uracil (U) in place of thymine (T).
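The substitution described in this note can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the function name `dna_codon_to_rna` is hypothetical, and it assumes the codon is written on the coding (sense) DNA strand, so that the RNA codon differs only in having U where the DNA has T.

```python
# Minimal sketch: convert a DNA codon to its RNA form.
# Assumption (not from the source): the codon is given on the coding
# (sense) strand, so the RNA codon is the same sequence with T -> U.

def dna_codon_to_rna(codon: str) -> str:
    """Return the RNA form of a three-letter DNA codon (T replaced by U)."""
    codon = codon.upper()
    # Reject anything that is not exactly three DNA bases.
    if len(codon) != 3 or set(codon) - set("ACGT"):
        raise ValueError(f"not a valid DNA codon: {codon!r}")
    return codon.replace("T", "U")


if __name__ == "__main__":
    for dna in ("ATG", "TTT", "GCA"):
        print(dna, "->", dna_codon_to_rna(dna))
```

For example, the DNA start codon ATG corresponds to the RNA codon AUG, while a codon with no thymine, such as GCA, is unchanged.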
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Stram, D.O. (2014). Introduction. In: Design, Analysis, and Interpretation of Genome-Wide Association Scans. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9443-0_1
DOI: https://doi.org/10.1007/978-1-4614-9443-0_1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9442-3
Online ISBN: 978-1-4614-9443-0
eBook Packages: Mathematics and Statistics (R0)