Computational detection and experimental validation of segmental duplications and associated copy number variations in water buffalo (Bubalus bubalis)

  • Shuli Liu
  • Xiaolong Kang
  • Claudia R. Catacchio
  • Mei Liu
  • Lingzhao Fang
  • Steven G. Schroeder
  • Wenli Li
  • Benjamin D. Rosen
  • Daniela Iamartino
  • Leopoldo Iannuzzi
  • Tad S. Sonstegard
  • Curtis P. Van Tassell
  • Mario Ventura
  • Wai Yee Low
  • John L. Williams
  • Derek M. Bickhart (corresponding author)
  • George E. Liu (corresponding author)
Original Article

Abstract

Duplicated sequences are an important source of gene evolution and structural variation within mammalian genomes. Using a read depth approach based on next-generation sequencing, we performed a genome-wide analysis of segmental duplications (SDs) and associated copy number variations (CNVs) in the water buffalo (Bubalus bubalis). By aligning short reads of Olimpia (the reference water buffalo) to the UMD3.1 cattle genome, we identified 1,038 segmental duplications comprising 44.6 Mb (equivalent to ~1.73% of the cattle genome) of the autosomal and X chromosomal sequence in the buffalo genome. We experimentally validated 70.3% (71/101) of these duplications using fluorescent in situ hybridization. We also detected a total of 1,344 CNV regions across 14 additional water buffaloes, amounting to 59.8 Mb of variable sequence, or the equivalent of 2.2% of the cattle genome. The CNV regions overlap 1,245 genes that are significantly enriched for specific biological functions, including immune response, oxygen transport, sensory system and signal transduction. Additionally, we performed array Comparative Genomic Hybridization (aCGH) experiments using the 14 water buffaloes as test samples and Olimpia as the reference. Using a linear regression model, we observed a high Pearson correlation (r = 0.781) between the log2 ratios of the sequence-based copy number estimates and the log2 ratios of the aCGH probes. We further designed quantitative PCR assays to confirm CNV regions within or near annotated genes and found 74.2% agreement with our CNV predictions. These results confirm sub-chromosome-scale structural rearrangements in cattle and water buffalo. This information on genome variation will be of value for evolutionary and phenotypic studies, and may be useful for selective breeding of both species.
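
The comparison between sequence-based copy number estimates and aCGH described above can be illustrated with a minimal sketch. It assumes that each test animal's copy number in a region is divided by the reference animal's copy number and log2-transformed, and that the result is then correlated with the aCGH log2 ratio for the same region. The function names and all numeric values below are hypothetical and chosen for illustration only; they are not data or code from the study.

```python
import math

def log2_cn_ratio(test_cn, ref_cn):
    """Per-region log2 ratio of test copy number to reference copy number."""
    return [math.log2(t / r) for t, r in zip(test_cn, ref_cn)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical copy number estimates for five regions (illustrative only).
test_cn = [2.0, 4.0, 1.0, 3.0, 2.0]   # test animal, from read depth
ref_cn = [2.0, 2.0, 2.0, 2.0, 2.0]    # reference animal, from read depth

# Hypothetical aCGH log2 ratios (test vs. reference) for the same regions.
acgh_log2 = [0.05, 0.80, -0.70, 0.40, -0.10]

seq_log2 = log2_cn_ratio(test_cn, ref_cn)
r = pearson_r(seq_log2, acgh_log2)
print(f"Pearson r between sequence and aCGH log2 ratios: {r:.3f}")
```

In a sketch of this kind, a gain relative to the reference gives a positive log2 ratio and a loss gives a negative one, so agreement between the two platforms appears as a positive correlation.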

Keywords

  • Segmental duplication
  • Copy number variation
  • Bubalus bubalis
  • Fluorescent in situ hybridization
  • Array Comparative Genomic Hybridization
  • Quantitative PCR

Abbreviations

  • aCGH: array Comparative Genomic Hybridization
  • BoLA: Bovine Leucocyte Antigens
  • BTF3: basic transcription factor 3
  • CN: copy number
  • CNVRs: CNV regions
  • CNVs: copy number variations
  • CT: cycle thresholds
  • DEFB: β-Defensin
  • FCGR3A: Fc fragment of IgG receptor IIIa
  • FDR: false discovery rate
  • FZD3: frizzled class receptor 3
  • FISH: fluorescence in situ hybridization
  • HTS: high throughput sequencing
  • KLRK1: killer cell lectin like receptor K1
  • LOESS: Locally Weighted Scatter-plot Smoother
  • MAD2L1: mitotic arrest deficient 2 like 1
  • MHC: major histocompatibility complex
  • OR: olfactory receptor
  • PAG: pregnancy-associated glycoprotein
  • PI3: peptidase inhibitor 3
  • RD: read depth
  • RP: read pair
  • SA: sequence assembly
  • SDs: segmental duplications
  • SNPs: single nucleotide polymorphisms
  • SR: split read
  • STDEVs: standard deviations
  • TRAV: T cell receptor alpha variable
  • ULBP3: UL16 binding protein 3
  • WSSD: whole genome shotgun sequence detection

Notes

Acknowledgements

We thank Reuben Anderson and Alexandre Dimtchev for technical assistance. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture. The USDA is an equal opportunity provider and employer.

Author Contributions

DMB and GEL conceived and designed the experiments. JLW, DI, LI, SGS, TSS, CPVT, CRC, and MV collected samples and/or generated HTS and FISH data. DMB, SL, XK, ML, and BDR performed computational and statistical analyses for HTS, aCGH and qPCR. SL, DMB and GEL wrote the paper. All authors read and approved the final manuscript.

Funding

GEL was partially supported by appropriated project 1265-3200-083-00D from the USDA Agricultural Research Service (Beltsville Agricultural Research Center), AFRI grant number 2013-67015-20951 from the USDA National Institute of Food and Agriculture (NIFA) Animal Genome and Reproduction Programs, and BARD grant number US-4997-17 from the US-Israel Binational Agricultural Research and Development (BARD) Fund. WL and DMB were supported by appropriated project 5090-31000-024-00-D from the USDA Agricultural Research Service (Dairy Forage Research Center). WYL and JLW are funded by the JS Davies Bequest to the University of Adelaide.

Compliance with ethical standards

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Supplementary material

  • ESM 1: 10142_2019_657_MOESM1_ESM.pdf (PDF, 1257 kb)
  • ESM 2: 10142_2019_657_MOESM2_ESM.xlsx (XLSX, 570 kb)

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Shuli Liu (1, 2)
  • Xiaolong Kang (1, 3)
  • Claudia R. Catacchio (4)
  • Mei Liu (1, 5)
  • Lingzhao Fang (1, 6)
  • Steven G. Schroeder (1)
  • Wenli Li (7)
  • Benjamin D. Rosen (1)
  • Daniela Iamartino (8, 9)
  • Leopoldo Iannuzzi (10)
  • Tad S. Sonstegard (11)
  • Curtis P. Van Tassell (1)
  • Mario Ventura (4)
  • Wai Yee Low (12)
  • John L. Williams (12)
  • Derek M. Bickhart (7), corresponding author
  • George E. Liu (1), corresponding author
  1. USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, USA
  2. College of Animal Science and Technology, China Agricultural University, Beijing, China
  3. College of Agriculture, Ningxia University, Yinchuan, China
  4. Department of Biology, University of Bari, Bari, Italy
  5. College of Animal Science and Technology, Shaanxi Key Laboratory of Agricultural Molecular Biology, Northwest A&F University, Yangling, China
  6. Department of Animal and Avian Sciences, University of Maryland, College Park, USA
  7. The Cell Wall Utilization and Biology Laboratory, US Dairy Forage Research Center, USDA, ARS, Madison, USA
  8. AIA-LGS, Associazione Italiana Allevatori - Laboratorio Genetica e Servizi, Cremona, Italy
  9. Parco Tecnologico Padano, Via Einstein, Polo Universitario, Lodi, Italy
  10. Laboratory of Animal Cytogenetics and Gene Mapping, National Research Council (CNR), ISPAAM, Naples, Italy
  11. Recombinetics, St Paul, USA
  12. Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, Australia
