GPU-Accelerated Bidirected De Bruijn Graph Construction for Genome Assembly

  • Mian Lu
  • Qiong Luo
  • Bingqiang Wang
  • Junkai Wu
  • Jiuxin Zhao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7808)

Abstract

De Bruijn graph construction is a basic component of de novo genome assembly for short reads generated by second-generation sequencing machines. Because this component processes a large amount of data and performs intensive computation, we propose to use the GPU (Graphics Processing Unit) for acceleration. Specifically, we propose a staged algorithm that uses the GPU for computation over large data sets that do not fit into GPU memory. We also pipeline the I/O, GPU, and CPU processing to further improve overall performance. Our preliminary results show that our GPU-accelerated graph construction on an NVIDIA S1070 server achieves roughly a twofold speedup over previously reported performance results on a 1024-node IBM Blue Gene/L.
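
To make the idea of staged processing concrete, the following is a minimal CPU-only sketch in Python, not the authors' GPU implementation. It assumes that an edge of the bidirected graph can be represented by a canonical (k+1)-mer (the lexicographically smaller of a string and its reverse complement), and that reads are processed in fixed-size chunks whose partial results are merged afterwards. All function names and the `chunk_size` parameter are hypothetical, and the I/O, GPU, and CPU pipelining mentioned in the abstract is not modeled here.

```python
from collections import Counter
from itertools import islice

# Translation table for complementing DNA bases (assumes reads contain only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def chunks(reads, chunk_size):
    """Yield successive lists of at most chunk_size reads (one 'stage' each)."""
    it = iter(reads)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def edges_in_chunk(chunk, k):
    """Count canonical (k+1)-mers in one chunk; each stands for an edge between two k-mer nodes."""
    counts = Counter()
    for read in chunk:
        for i in range(len(read) - k):
            counts[canonical(read[i:i + k + 1])] += 1
    return counts


def build_graph(reads, k, chunk_size):
    """Process reads chunk by chunk and merge the per-chunk edge counts."""
    total = Counter()
    for chunk in chunks(reads, chunk_size):
        # In a GPU setting this is where a chunk would be copied to the device and processed.
        total.update(edges_in_chunk(chunk, k))
    return total


if __name__ == "__main__":
    print(build_graph(["ACGTA", "TACGT"], k=3, chunk_size=1))
```

Because the per-chunk counts are simply added together, the merged result does not depend on the chunk size, which is what allows the chunk size to be chosen to fit the available device memory.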

Keywords

Graphic Processing Unit, Genome Assembly, Main Memory, Graph Construction, Chunk Size
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Mian Lu (1)
  • Qiong Luo (2)
  • Bingqiang Wang (3)
  • Junkai Wu (2)
  • Jiuxin Zhao (2)
  1. A*STAR Institute of High Performance Computing, Singapore
  2. Hong Kong University of Science and Technology, China
  3. BGI-Shenzhen, China
