Haplotype Allele Frequency (HAF) Score: Predicting Carriers of Ongoing Selective Sweeps Without Knowledge of the Adaptive Allele

  • Roy Ronen
  • Glenn Tesler
  • Ali Akbari
  • Shay Zakov
  • Noah A. Rosenberg
  • Vineet Bafna
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9029)


Methods for detecting the genomic signatures of natural selection are heavily studied and have been successful in identifying many selective sweeps. For the vast majority of these sweeps the adaptive allele remains unknown, making it difficult to distinguish carriers of the sweep from non-carriers. Because carriers of ongoing selective sweeps are likely to contain a future most recent common ancestor, identifying them may prove useful in predicting the evolutionary trajectory, for example in contexts involving drug-resistant pathogen strains or cancer subclones. The main contribution of this paper is the development and analysis of a new statistic, the Haplotype Allele Frequency (HAF) score, assigned to individual haplotypes in a sample. The HAF score naturally captures many of the properties shared by haplotypes carrying an adaptive allele. We provide a theoretical model for the behavior of the HAF score under different evolutionary scenarios, and validate the interpretation of the statistic with simulated data. We develop an algorithm (PreCIOSS: Predicting Carriers of Ongoing Selective Sweeps) to identify carriers of the adaptive allele in selective sweeps, and we demonstrate its power on simulations of both hard and soft selective sweeps, as well as on data from well-known sweeps in human populations.
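The abstract does not spell out how the HAF score is computed. As a minimal sketch, assume the score of a haplotype is the sum, over the sites where it carries the derived allele, of the number of sampled haplotypes carrying that derived allele. This definition, the function name `haf_scores`, and the toy sample are assumptions made for illustration; they are not taken from the text above, and the sketch is not the PreCIOSS algorithm.

```python
from typing import List


def haf_scores(haplotypes: List[List[int]]) -> List[int]:
    """Per-haplotype score under the ASSUMED definition described above.

    haplotypes: one row per haplotype, one column per site;
                1 = derived allele, 0 = ancestral allele.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    # Number of sampled haplotypes carrying the derived allele at each site.
    derived_counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    # Each haplotype sums the counts of the derived alleles it carries.
    return [
        sum(derived_counts[j] for j in range(n_sites) if h[j] == 1)
        for h in haplotypes
    ]


# Hypothetical toy sample: 4 haplotypes, 4 segregating sites.
sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
scores = haf_scores(sample)
print(scores)  # [5, 6, 3, 1]
```

In this toy sample, the haplotypes that share the common derived alleles receive the higher scores, while the haplotype carrying only a singleton scores lowest. Under the stated assumption, this illustrates how a per-haplotype score could separate haplotypes that share many high-frequency derived alleles from the rest of the sample.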


Keywords (added by machine, not by the authors): Favored Allele; Selective Sweep; Prostate Stem Cell Antigen; Coalescent Simulation; Coalescent Model





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Roy Ronen (1)
  • Glenn Tesler (2)
  • Ali Akbari (3)
  • Shay Zakov (4)
  • Noah A. Rosenberg (5)
  • Vineet Bafna (4)

  1. Bioinformatics Graduate Program, University of California, San Diego, La Jolla, USA
  2. Department of Mathematics, University of California, San Diego, La Jolla, USA
  3. Electrical and Computer Engineering, University of California, San Diego, La Jolla, USA
  4. Department of Computer Science and Engineering, University of California, San Diego, La Jolla, USA
  5. Department of Biology, Stanford University, Stanford, USA
