Neem Genome Assembly

  • Nagesh A. Kuravadi
  • Malali Gowda
Part of the Compendium of Plant Genomes book series (CPG)


Genome assembly is the process of taking a large number of short DNA sequences and putting them back together to create a representation of the original chromosomes from which the DNA originated. To assemble the neem genome, we filtered the raw Illumina NGS reads to retain high-quality reads and assembled them with the Velvet assembler. A similar process was followed to assemble the Roche/454 reads using MIRA. The individual Velvet and MIRA assemblies were merged using cd-hit to obtain the final neem genome assembly. The neem transcriptome was assembled from short-read data from different tissues using the Trinity assembler. Reads for the chloroplast and mitochondrial genomes were extracted separately by mapping genome reads to the chloroplast and mitochondrial genomes of known plants, and each set of reads was then assembled separately using Velvet.
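The read-filtering step that precedes assembly can be illustrated with a minimal sketch. It is not the authors' actual filtering procedure: the function names, the mean-quality criterion, the Phred+33 encoding, and the thresholds (mean quality 20, minimum length 50) are all assumptions chosen for illustration.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines.

    Assumes the simple four-line-per-record FASTQ layout.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip stray blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return mean(ord(c) - offset for c in qual)


def filter_reads(records, min_mean_q=20, min_len=50):
    """Keep reads that are long enough and have a high enough mean quality.

    Both thresholds are hypothetical defaults, not values from the chapter.
    """
    for header, seq, qual in records:
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual
```

In practice, the retained reads would be written back to a FASTQ file and passed to the assembler. Dedicated read-trimming tools apply more refined criteria, such as trimming low-quality ends rather than discarding whole reads.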


Shotgun sequencing, De novo, Genome assembly, MIRA, Velvet



We acknowledge the Genomics facility (BT/PR3481/INF/22/140/2011) at the Centre for Cellular and Molecular Platforms, Bangalore, for sequencing of the neem genomes. We acknowledge Pradeep H, Aarati Karaba, Manojkumar S and Annapurna for their help in NGS library preparation and sequencing. We thank Ashmita G and Divya S for their help in manual curation of SSR markers. We are grateful to Rajanna, National Botanical Garden, University of Agricultural Sciences, GKVK campus, Bangalore, for his help during neem sample collection.


  1. Chevreux B (2005) MIRA: an automated genome and EST assembler. Ruprecht-Karls University, Heidelberg, Germany
  2. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644–652
  3. Henschel R, Lieber M, Wu LS, Nista PM, Haas BJ, LeDuc RD (2012) Trinity RNA-Seq assembler performance optimization. In: Proceedings of the 1st conference of the extreme science and engineering discovery environment: bridging from the eXtreme to the campus and beyond, p 45
  4. Kuravadi NA, Yenagi V, Rangiah K, Mahesh HB, Rajamani A, Shirke MD, Russiachand H, Loganathan RM, Lingu CS, Siddappa S (2015) Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree. PeerJ 3:e1066
  5. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359
  6. Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22:1658–1659
  7. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
  8. Parra G, Bradnam K, Korf I (2007) CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23(9):1061–1067
  9. Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Centre for Cellular and Molecular Platforms, National Centre for Biological Sciences, Bengaluru, India
  2. Center for Functional Genomics and Bio-Informatics, The University of TransDisciplinary and Health Sciences, Bengaluru, India
