
Germ-line DNA copy number variation frequencies in a large North American population


Genomic copy number variation (CNV) is a recently identified form of global genetic variation in the human genome. The Affymetrix GeneChip 100 K and 500 K SNP genotyping platforms were used to perform a large-scale population-based study of CNV frequency. We constructed a genomic map of 578 CNV regions, covering approximately 220 Mb (7.3%) of the human genome, and identified 183 previously unknown intervals. Copy number changes occurred infrequently (<1%) in the majority (>93%) of these genomic regions, yet these regions encompass hundreds of genes and disease loci. This North American population-based map will be a useful resource for future genetic studies.



  1. Aitman TJ, Dong R, Vyse TJ, Norsworthy PJ, Johnson MD, Smith J, Mangion J, Roberton-Lowe C, Marshall AJ, Petretto E, Hodges MD, Bhangal G, Patel SG, Sheehan-Rooney K, Duda M, Cook PR, Evans DJ, Domin J, Flint J, Boyle JJ, Pusey CD, Cook HT (2006) Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature 439:851–855

  2. Berciano J, Calleja J, Combarros O (1994) Charcot-Marie-Tooth disease. Neurology 44:1985–1986

  3. Conrad DF, Andrews TD, Carter NP, Hurles ME, Pritchard JK (2006) A high-resolution survey of deletion polymorphism in the human genome. Nat Genet 38:75–81

  4. Cotterchio M, McKeown-Eyssen G, Sutherland H, Buchan G, Aronson M, Easson AM, Macey J, Holowaty E, Gallinger S (2000) Ontario familial colon cancer registry: methods and first-year response rates. Chronic Dis Can 21:81–86

  5. Di X, Matsuzaki H, Webster TA, Hubbell E, Liu G, Dong S, Bartell D, Huang J, Chiles R, Yang G, Shen MM, Kulp D, Kennedy GC, Mei R, Jones KW, Cawley S (2005) Dynamic model based algorithms for screening and genotyping over 100 K SNPs on oligonucleotide microarrays. Bioinformatics 21:1958–1963

  6. Fanciulli M, Norsworthy PJ, Petretto E, Dong R, Harper L, Kamesh L, Heward JM, Gough SC, de Smith A, Blakemore AI, Froguel P, Owen CJ, Pearce SH, Teixeira L, Guillevin L, Graham DS, Pusey CD, Cook HT, Vyse TJ, Aitman TJ (2007) FCGR3B copy number variation is associated with susceptibility to systemic, but not organ-specific, autoimmunity. Nat Genet 39:721–723

  7. Fellermann K, Stange DE, Schaeffeler E, Schmalzl H, Wehkamp J, Bevins CL, Reinisch W, Teml A, Schwab M, Lichter P, Radlwimmer B, Stange EF (2006) A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn’s disease of the colon. Am J Hum Genet 79:439–448

  8. Freeman JL, Perry GH, Feuk L, Redon R, McCarroll SA, Altshuler DM, Aburatani H, Jones KW, Tyler-Smith C, Hurles ME, Carter NP, Scherer SW, Lee C (2006) Copy number variation: new insights in genome diversity. Genome Res 16:949–961

  9. Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, Catano G, Nibbs RJ, Freedman BI, Quinones MP, Bamshad MJ, Murthy KK, Rovin BH, Bradley W, Clark RA, Anderson SA, O'Connell RJ, Agan BK, Ahuja SS, Bologna R, Sen L, Dolan MJ, Ahuja SK (2005) The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307:1434–1440

  10. Hamosh A, Scott AF, Amberger J, Bocchini C, Valle D, McKusick VA (2002) Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 30:52–55

  11. Higgins ME, Claremount M, Major JE, Sander C, Lash AE (2006) Cancer Genes: a gene selection resource for cancer genome projects. Nucleic Acids Res 35:D721–D726

  12. Hinds DA, Kloek AP, Jen M, Chen X, Frazer KA (2006) Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat Genet 38:82–85

  13. Hua J, Craig DW, Brun M, Webster J, Zismann V, Tembe W, Joshipura K, Huentelman MJ, Dougherty ER, Stephan DA (2007) SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays. Bioinformatics 23:57–63

  14. Huang J, Wei W, Chen J, Zhang J, Liu G, Di X, Mei R, Ishikawa S, Aburatani H, Jones KW, Shapero MH (2006) CARAT: a novel method for allelic detection of DNA copy number changes using high density oligonucleotide arrays. BMC Bioinformatics 7:83

  15. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C (2004) Detection of large-scale variation in the human genome. Nat Genet 36:949–951

  16. Karolchik D, Baertsch R, Diekhans M, Furey TS, Hinrichs A, Lu YT, Roskin KM, Schwartz M, Sugnet CW, Thomas DJ, Weber RJ, Haussler D, Kent WJ (2003) The UCSC Genome Browser Database. Nucleic Acids Res 31:51–54

  17. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ (2004) The UCSC Table Browser data retrieval tool. Nucleic Acids Res 32:D493–D496

  18. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D (2002) The human genome browser at UCSC. Genome Res 12:996–1006

  19. Le Marechal C, Masson E, Chen JM, Morel F, Ruszniewski P, Levy P, Ferec C (2006) Hereditary pancreatitis caused by triplication of the trypsinogen locus. Nat Genet 38:1372–1374

  20. Lee JA, Lupski JR (2006) Genomic rearrangements and gene copy-number alterations as a cause of nervous system disorders. Neuron 52:103–121

  21. McCarroll S, Hadnott TH, Perry GH, Sabeti PC, Zody MC, Barrett JC, Dallaire S, Gabriel SB, Lee C, Daly MJ, Altshuler DM, The International HapMap Consortium (2006) Common deletion polymorphisms in the human genome. Nat Genet 38:86–92

  22. Nannya Y, Sanada M, Nakazaki K, Hosoya N, Wang L, Hangaishi A, Kurokawa M, Chiba S, Bailey DK, Kennedy GC, Ogawa S (2005) A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Res 65:6071–6079

  23. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H et al (2006) Global variation in copy number in the human genome. Nature 444:444–454

  24. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M (2004) Large-scale copy number polymorphism in the human genome. Science 305:525–528

  25. Sharp AJ, Locke DP, McGrath SD, Cheng Z, Bailey JA, Vallente RU, Pertz LM, Clark RA, Schwartz S, Segraves R, Oseroff VV, Albertson DG, Pinkel D, Eichler EE (2005) Segmental duplications and copy-number variation in the human genome. Am J Hum Genet 77:78–88

  26. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D et al (2006) The consensus coding sequences of human breast and colorectal cancers. Science 314:268–274

  27. Tuzun E, Sharp AJ, Bailey JA, Kaul R, Morrison VA, Pertz LM, Haugen E, Hayden H, Albertson D, Pinkel D, Olson MV, Eichler EE (2005) Fine-scale structural variation of the human genome. Nat Genet 37:727–732

  28. Wong KK, deLeeuw RJ, Dosanjh NS, Kimm LR, Cheng Z, Horsman DE, MacAulay C, Ng RT, Brown CJ, Eichler EE, Lam WL (2007) A comprehensive analysis of common copy-number variations in the human genome. Am J Hum Genet 80:91–104



This work was supported by the National Cancer Institute, National Institutes of Health under RFA # CA-96-011 (to SG & MC) and through cooperative agreements with members of the Colon Cancer Family Registry and P.I.s. The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating institutions or investigators in the Colon CFR, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government or the Colon CFR. SG is the recipient of a grant from the Lustgarten Foundation for Pancreas Cancer Research, which also supported this work. Cancer Care Ontario, as the host organization to the ARCTIC Genome Project, acknowledges that this Project was funded by Genome Canada through the Ontario Genomics Institute, by Génome Québec, the Ministère du Développement Économique et Régional et de la Recherche du Québec and the Ontario Institute for Cancer Research. GZ is a Scholar of the Society of University Surgeons and a recipient of a Terry Fox Foundation Research Fellowship from the National Cancer Institute of Canada. The authors thank Dr. S. Ogawa for providing early access to CNAG version 2, Drs. C. Marshall and L. Feuk for their advice, and Dr. D. Daftary, Ms. T. Selander, Dr. Ling Liu and the Mount Sinai Hospital Biospecimen Repository for technical assistance.

Author information

Correspondence to Steven Gallinger.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Table 1. CNVR Coordinates, Population Frequency and Ancestry-Associations. (XLS 120 kb)

Supplementary Table 2. Enriched gene ontology categories in CNVRs. Known genes that overlapped with CNVRs were tested for over- or under-representation of specific Gene Ontology (GO) gene function annotation terms (The Gene Ontology Consortium 2000) using the BiNGO software (Maere et al. 2005) and updated human GO annotation downloaded on 3 December 2006. A total of 1,351 unique genes were tested against the entire GO reference set, consisting of 16,123 annotated genes. Significance was assessed using the hypergeometric test with Benjamini & Hochberg False Discovery Rate multiple testing correction. UniProt (Wu et al. 2006) identifiers (IDs) for each known gene were converted to Entrez Gene IDs (Wheeler et al. 2006) using Expasy (Gasteiger et al. 2003) and Ensembl (Clamp et al. 2003). The following GO terms were enriched in CNVRs; no significant under-representation of any GO category was observed. (DOC 34 kb)

Supplementary Table 3. CNVRs associated with 55 cancer genes, drawn from 406 genes known to be involved in cancer (downloaded from the Cancer Genes resource; Higgins et al. 2006) and 189 genes reported by Sjoblom et al. (2006). (XLS 27 kb)

Supplementary Table 4. CNVRs associated with OMIM Morbid Map genes (Hamosh et al. 2002, downloaded in November 2006). (XLS 47 kb)

Supplementary Table 5. Novel CNVRs. (XLS 64 kb)

Supplementary Table 6. CNVRs not identified in the HapMap sample collection (Redon et al. 2006). (XLS 45 kb)
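The enrichment procedure described for Supplementary Table 2 (a hypergeometric test followed by Benjamini & Hochberg correction) can be illustrated with a minimal Python sketch. This is not the BiNGO implementation: the function names and the per-term gene counts below are hypothetical, and only the totals of 16,123 reference genes and 1,351 tested genes are taken from the table description.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    M: genes in the reference set, n: reference genes carrying the GO term,
    N: genes in the tested set, k: tested genes carrying the GO term.
    """
    upper = min(n, N)
    total = comb(M, N)
    # comb() returns 0 for impossible configurations, so no lower bound is needed.
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    REFERENCE_GENES = 16123  # from the table description
    TESTED_GENES = 1351      # from the table description
    # Hypothetical counts: term -> (reference genes with term, tested genes with term)
    terms = {
        "term_A": (400, 60),
        "term_B": (250, 22),
        "term_C": (900, 75),
    }
    pvals = [
        hypergeom_sf(k, REFERENCE_GENES, n, TESTED_GENES)
        for n, k in terms.values()
    ]
    for name, p, q in zip(terms, pvals, benjamini_hochberg(pvals)):
        print(f"{name}: p={p:.3g}, BH-adjusted={q:.3g}")
```

Exact integer binomial coefficients are used here so that the sketch needs only the standard library; a statistics package would normally be preferred for large term lists.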


About this article

Cite this article

Zogopoulos, G., Ha, K.C.H., Naqib, F. et al. Germ-line DNA copy number variation frequencies in a large North American population. Hum Genet 122, 345–353 (2007).



  • Affymetrix GeneChip
  • Copy Number Change
  • Copy Number Alteration
  • Population Frequency
  • Copy Number Gain