Optimization Algorithms for Identification and Genotyping of Copy Number Polymorphisms in Human Populations

  • Gökhan Yavaş
  • Mehmet Koyutürk
  • Thomas LaFramboise
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)


Recent studies show that copy number polymorphisms (CNPs), defined as genome segments that are polymorphic with regard to genomic copy number and segregate at a frequency greater than 1% in a population, are associated with various diseases. Because rare copy number variations (CNVs) and CNPs have different characteristics, the problem of discovering CNPs presents opportunities beyond those available to algorithms designed to identify rare CNVs. We present a method for identifying and genotyping common CNPs. The proposed method, POLYGON, frames CNP identification and genotyping as an optimization problem with an explicitly formulated objective function; it produces copy number genotypes of the samples at each CNP and fine-tunes the CNP's boundaries. We apply POLYGON to data from hundreds of samples and demonstrate that it significantly improves the performance of existing single-sample CNV identification methods. We also demonstrate its superior performance compared with two other CNP identification/genotyping methods.
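The 1% frequency criterion in the CNP definition can be illustrated with a small sketch. This is not POLYGON and not the authors' code; it only checks whether a segment would count as common given per-sample copy number genotypes. The function names, the reference copy number of 2 (an autosomal diploid assumption), and the choice to measure frequency as the fraction of samples carrying a non-reference copy number are all assumptions made for illustration.

```python
from typing import Sequence


def variant_frequency(copy_numbers: Sequence[int], reference_cn: int = 2) -> float:
    """Fraction of samples whose copy number differs from the reference.

    Assumption: frequency is measured per sample (carrier fraction),
    and the reference state is a copy number of 2.
    """
    if not copy_numbers:
        raise ValueError("need at least one sample")
    non_reference = sum(1 for cn in copy_numbers if cn != reference_cn)
    return non_reference / len(copy_numbers)


def is_common_cnp(
    copy_numbers: Sequence[int],
    threshold: float = 0.01,
    reference_cn: int = 2,
) -> bool:
    """True if the non-reference fraction is strictly greater than the threshold."""
    return variant_frequency(copy_numbers, reference_cn) > threshold


if __name__ == "__main__":
    # Hypothetical genotypes: 97 samples with two copies, two deletions, one gain.
    genotypes = [2] * 97 + [1, 1, 3]
    print(variant_frequency(genotypes), is_common_cnp(genotypes))
```

The strict inequality mirrors the phrase "greater than 1%"; a segment at exactly 1% would not qualify under this reading.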


Keywords: CNV, CNP, optimization



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Gökhan Yavaş (1)
  • Mehmet Koyutürk (1, 3)
  • Thomas LaFramboise (2, 3)
  1. Department of Electrical Engineering & Computer Science, Case Western Reserve University, Cleveland, USA
  2. Department of Genetics, Case Western Reserve University, Cleveland, USA
  3. Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, USA
