
Characterization of a deep-coverage carrot (Daucus carota L.) BAC library and initial analysis of BAC-end sequences


Carrot is the most economically important member of the Apiaceae family and a major source of provitamin A carotenoids in the human diet. However, carrot molecular resources are relatively underdeveloped, which hampers a number of genetic studies. Here, we report on the construction and characterization of a bacterial artificial chromosome (BAC) library of carrot. The library is 17.3-fold redundant and consists of 92,160 clones with an average insert size of 121 kb. To provide an overview of the composition and organization of the carrot nuclear genome, we generated and analyzed 2,696 BAC-end sequences (BES) from nearly 2,000 BACs, totaling 1.74 Mb of BES. This analysis revealed that 14% of the BES consists of known repetitive elements, with transposable elements representing more than 80% of this fraction. Eleven novel carrot repetitive elements were identified, covering 8.5% of the BES. Microsatellite analysis showed a comparatively low frequency of these elements in the carrot BES. Comparisons of the translated BES with protein databases indicated that approximately 10% of the carrot genome represents coding sequences. Moreover, among eight dicot species used for comparison purposes, carrot BES showed the highest homology to protein-coding sequences from tomato. This deep-coverage library will aid carrot breeding and genetics.
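The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the article: the function names are hypothetical, and the recovery probability uses the Poisson approximation P = 1 - exp(-coverage), which assumes clones sample the genome uniformly at random (an assumption the abstract does not state).

```python
import math

# Values quoted in the abstract.
N_CLONES = 92_160
MEAN_INSERT_KB = 121
COVERAGE = 17.3              # reported library redundancy
N_BES = 2_696
BES_TOTAL_MB = 1.74
KNOWN_REPEAT_FRACTION = 0.14
TE_SHARE_OF_REPEATS = 0.80   # "more than 80%", so treated as a lower bound


def total_insert_gb(n_clones, mean_insert_kb):
    """Total cloned DNA in Gb, ignoring empty or organellar clones."""
    return n_clones * mean_insert_kb / 1e6


def prob_locus_recovered(coverage):
    """Chance that a given locus is present in at least one clone
    (Poisson approximation; assumes uniform random sampling)."""
    return 1.0 - math.exp(-coverage)


def mean_bes_length_bp(total_mb, n_sequences):
    """Average BAC-end sequence length in base pairs."""
    return total_mb * 1e6 / n_sequences


def te_fraction_lower_bound(repeat_fraction, te_share):
    """Lower bound on the share of BES made up of known transposable elements."""
    return repeat_fraction * te_share


if __name__ == "__main__":
    print(f"Total insert DNA: {total_insert_gb(N_CLONES, MEAN_INSERT_KB):.2f} Gb")
    print(f"P(locus recovered): {prob_locus_recovered(COVERAGE):.8f}")
    print(f"Mean BES length: {mean_bes_length_bp(BES_TOTAL_MB, N_BES):.0f} bp")
    print(f"Known TEs in BES: >= {te_fraction_lower_bound(KNOWN_REPEAT_FRACTION, TE_SHARE_OF_REPEATS):.1%}")
```

Under these assumptions, the library holds roughly 11 Gb of cloned DNA, the average BES is about 645 bp long, and known transposable elements account for at least about 11% of the BES.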




Author information

Correspondence to Philipp W. Simon.

Additional information

Nucleotide sequence data reported are available in the DDBJ/EMBL/GenBank databases under the accession numbers FJ147695–FJ150390.

Communicated by R. Hagemann.



About this article

Cite this article

Cavagnaro, P.F., Chung, S., Szklarczyk, M. et al. Characterization of a deep-coverage carrot (Daucus carota L.) BAC library and initial analysis of BAC-end sequences. Mol Genet Genomics 281, 273–288 (2009).



  • Daucus carota
  • BAC (bacterial artificial chromosome) library
  • BAC-end sequences
  • Transposable elements
  • Microsatellites