The Draft Genome of the MD-2 Pineapple
With advances in sequencing technology, it is now possible to decode complex plant genomes with high accuracy. For many years, short reads were the dominant data type for genome assembly, until the introduction of third-generation long-read sequencers. Long reads can span complex repeat regions, avoiding the erroneous collapse of repeats that reduces the assembled genome size. However, the low accuracy of long reads is a concern and hinders their direct application in de novo assemblies of large genomes. Here, we report the whole-genome assembly of the MD-2 pineapple using a hybrid sequencing approach. We used Illumina short reads to correct the systematic errors of the long PacBio reads. The error-corrected long reads were then used to assemble the MD-2 pineapple genome de novo with multiple assembly software packages and strategies. The best accuracy and contiguity were achieved in the de novo assembly of error-corrected long reads using Celera. The MD-2 pineapple genome assembly achieved an N50 of 153,084 bp with 8448 scaffolds and a total assembly size of 524.07 Mb. In addition, 245 of the 248 ultra-conserved core eukaryotic genes (CEGs) were found in the genome, indicating completeness of more than 98%. Furthermore, 87% of the mapped transcripts were identified in the genome with coverage of more than 90%, while another 12% were mapped with coverage of more than 80%. This MD-2 pineapple genome provides a high-quality draft for gene prediction and further downstream applications in pineapple.
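For readers unfamiliar with the two summary metrics quoted above, the following is a minimal Python sketch of how an N50 and a CEG-based completeness percentage are commonly computed. It is an illustration only, not the pipeline used in this work: the function names `n50` and `completeness_percent` and the toy scaffold lengths are hypothetical, and only the counts 245 and 248 come from the abstract.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    N50 is the length L of the sequence at which, walking from the longest
    sequence downward, the running total first reaches at least half of the
    total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input; kept for safety


def completeness_percent(found: int, expected: int) -> float:
    """Percentage of expected conserved genes that were found."""
    return 100.0 * found / expected


# Toy scaffold lengths (hypothetical; NOT taken from the MD-2 assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# CEG counts as reported in the abstract: 245 found out of 248.
ceg_completeness = completeness_percent(245, 248)

print(f"toy N50: {toy_n50}")
print(f"CEG completeness: {ceg_completeness:.2f}%")
```

In the toy example the total length is 300, so the running sum reaches the halfway point (150) at the second-longest scaffold. Applied to the reported CEG counts, the same arithmetic gives roughly 98.8%, consistent with the stated completeness of more than 98%.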
Keywords: Pineapple; Plant genome sequencing; Hybrid assembly; Sequencing technology; Heterozygous genome
We thank Hydayaty Yusoff and the Pineapple Board of Malaysia for the pineapple sample, Caroline Chan from Pacific Biosciences (Asia Pacific) and Dana Chow from TreeCode Sdn Bhd for assistance with the Pacific Biosciences RSII, and Novocraft Sdn. Bhd. for the computing facility used in this project. This project is funded by the Ministry of Education and the Ministry of Science, Technology and Innovation, Malaysia, through the Fundamental Research Grant Scheme (FRG0319-SG-2013) and Science Fund (SCF0087-BIO-2013), respectively.