Abstract
Copy number variation (CNV) is the most prevalent type of genetic structural variation and has been recognized as an important source of phenotypic variation in humans, animals and plants. However, the mechanisms underlying the evolution of CNVs and their function in natural or artificial selection remain unknown. Here, we generated datasets of CNV regions (CNVRs) that were diverged or shared among cattle, goat, and sheep, covering 886 individuals from 171 diverse populations. Using 9 environmental factors in a genome-wide association study (GWAS), we identified a series of candidate CNVRs that include genes related to immunity, tick resistance, multi-drug resistance, and muscle development. The number of CNVRs shared between species is significantly higher than expected (P<0.00001), and these CNVRs may be more persistent than the single nucleotide polymorphisms (SNPs) shared between species. We also identified genomic regions under long-term balancing selection and uncovered the potential diversity of the selected CNVRs close to important functional genes. This study provides evidence that balancing selection might be more common in mammals than previously considered, and might play an important role in the daily activities of these ruminant species.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (31822052, 31572381), the National Thousand Youth Talents Plan, and the Program of the National Beef Cattle and Yak Industrial Technology System (CARS-37). We thank the High-Performance Computing platform of Northwest A&F University. We thank Yu Wang, Xiangyu Pan, Ming Li, Xiaomeng Tian, Dongke Zhou, Zhirui Yang, Han Xu, Chunna Cao and other members of the genome of big data laboratory for discussions. We also thank members of the NextGen project for sharing their data.
Compliance and ethics
The author(s) declare that they have no conflict of interest. Animal care and the experiments were conducted according to the guidelines established by the Regulations for the Administration of Affairs Concerning Experimental Animals (Ministry of Science and Technology, China, 2004) and approved by the Institutional Animal Care and Use Committee (College of Animal Science and Technology, Northwest A&F University, China). Every effort was made to minimize animal pain, suffering, and distress and to reduce the number of animals used.
Cite this article
Huang, Y., Li, Y., Wang, X. et al. An atlas of CNV maps in cattle, goat and sheep. Sci. China Life Sci. (2021). https://doi.org/10.1007/s11427-020-1850-x
Keywords
- copy number variation
- species-shared
- balancing selection
- ruminant livestock