Volume 209, Issue 2, pp 477–493

Balancing selection contributed to domestication of autopolyploid sugarcane (Saccharum officinarum L.)

  • Jie Arro
  • Jong-Won Park
  • Ching Man Wai
  • Robert VanBuren
  • Yong-Bao Pan
  • Chifumi Nagai
  • Jorge da Silva
  • Ray Ming


Sugarcane was domesticated in New Guinea about 10,000 years ago, and the domesticated species Saccharum officinarum is autooctoploid. In diploid domesticated crops, genetic diversity is reduced at the genes controlling favorable traits. However, domestication traits in sugarcane, such as sugar content and biomass yield, are controlled by multiple genes with multiple alleles. A genomics approach to identifying genes involved in the transition from wild to domesticated forms may provide useful insight into complex polyploid traits such as sucrose accumulation. Fifteen accessions each of domesticated S. officinarum and the wild species S. robustum, and eighteen accessions of S. spontaneum, were used for sequencing of leaf and stalk transcriptomes. We found high allelic diversity among genes expressed in the stalk tissues, where the domesticated S. officinarum (Fis = 0.69) surprisingly has higher allele diversity than its wild relative S. spontaneum (Fis = 0.41). However, there were no SNP loci with extremely high FST values despite the observed higher average FST among the three species, indicative of the action of balancing selection. This is corroborated by nucleotide diversity and site frequency spectrum (SFS) patterns showing that the majority of expressed genes in S. officinarum have per-site heterozygosity comparable to that of the wild species. These candidate domestication genes, bearing signatures of balancing selection and excess singleton SNPs in S. officinarum, perturb pathways involved in sucrose and starch metabolism.
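
The per-locus FST reasoning in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes biallelic SNPs, equally weighted populations and known allele frequencies, and it uses the textbook definition FST = (HT - HS) / HT. The function names `expected_heterozygosity` and `wright_fst` are hypothetical.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def wright_fst(freqs: Sequence[float]) -> float:
    """Per-locus FST = (HT - HS) / HT from one allele frequency per population.

    Assumption (not from the source): populations are weighted equally and
    no sample-size correction is applied.
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    # HS: mean expected heterozygosity within populations
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    # HT: expected heterozygosity of the pooled (mean) allele frequency
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        # Monomorphic across all populations: no differentiation to measure
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Three hypothetical populations at one SNP
    print(wright_fst([0.45, 0.50, 0.55]))  # similar frequencies, low FST
    print(wright_fst([0.05, 0.50, 0.95]))  # divergent frequencies, higher FST
```

Under these assumptions, a locus with frequencies 0.2 and 0.8 in two populations gives an FST of about 0.36, whereas a locus fixed for alternative alleles gives 1. A scan that finds no loci near the upper extreme, as the abstract reports, is therefore one with no strongly differentiated SNPs.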


Keywords: Domestication, Polyploid, Population genetics, Selection scan, Sugarcane



This project was supported by grants from the International Consortium for Sugarcane Biotechnology, EBI BP2012OO2J17, the Texas Governor’s Office Emerging Technology Funds, Bioenergy, and US DOE DE-SC0010686.

Author Contributions

RM conceived the study; RM, JS and CN designed the study; JA and JWP conducted the lab work; CMW, RV and YBP contributed to the analyses and manuscript revisions; JA analyzed the data and wrote the manuscript.

Compliance with ethical standards

Conflict of interest

The authors declare no conflict of interest.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Supplementary material

Supplementary material 1: 10681_2016_1672_MOESM1_ESM.pdf (PDF 1546 kb)



Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Jie Arro (1)
  • Jong-Won Park (2)
  • Ching Man Wai (1)
  • Robert VanBuren (1)
  • Yong-Bao Pan (3)
  • Chifumi Nagai (4)
  • Jorge da Silva (5)
  • Ray Ming (1, 6), corresponding author

  1. Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, USA
  2. Citrus Center, Texas A&M University Kingsville, Weslaco, USA
  3. USDA-Agricultural Research Service, Houma, USA
  4. Hawaii Agriculture Research Center, Kunia, USA
  5. Texas A&M AgriLife Research, Texas A&M University System, Weslaco, USA
  6. FAFU and UIUC-SIB Joint Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, China
