Genome Sequencing

  • Michael Kube
  • Bojan Duduk
  • Kenro Oshima


Genome sequences are of major importance to phytoplasma research, as they provide the blueprint for understanding the evolution, metabolism and virulence factors of phytoplasmas. Genome projects on these obligate parasites start from metagenomic templates taken from colonised plant or insect vector material, which means that they must cope with large amounts of non-target DNA. This problem sets phytoplasmas apart from the majority of other bacterial genome projects, and methodological approaches address it by using heavily colonised tissues and by enriching phytoplasma DNA. The impact was severe for the first genome projects, which relied on Sanger sequencing, whereas the most recent phytoplasma genome projects have tried to overcome the problem with very large numbers of reads from next-generation sequencing (NGS), enabling the generation of draft sequences or even complete phytoplasma genomes. In addition, sequence determination is hampered by the repeat-rich content of phytoplasma genomes, which causes conflicts during sequence assembly. An overview is provided of the strategies applied to phytoplasma genome sequencing and data processing, as well as of the data currently available on these particular bacteria.


NGS; Complete genomes; Draft sequences; Genome instability



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Michael Kube (1)
  • Bojan Duduk (2)
  • Kenro Oshima (3)

  1. Integrative Infection Biology Crops-Livestock, University of Hohenheim, Stuttgart, Germany
  2. Institute of Pesticides and Environmental Protection, Belgrade, Serbia
  3. Department of Clinical Plant Science, Faculty of Bioscience and Applied Chemistry, Hosei University, Tokyo, Japan
