HCV Quasispecies Assembly Using Network Flows

  • Kelly Westbrooks
  • Irina Astrovskaya
  • David Campo
  • Yury Khudyakov
  • Piotr Berman
  • Alex Zelikovsky
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


Understanding how the genomes of viruses mutate and evolve within infected individuals is critically important in epidemiology. By exploiting knowledge of the forces that guide viral microevolution, researchers can design drugs and treatments that are effective against newly evolved strains. Therefore, it is critical to develop a method for typing the genomes of all of the variants of a virus (quasispecies) inside an infected individual.

In this paper, we focus on sequence assembly of Hepatitis C Virus (HCV) based on the 454 Life Sciences system, which produces around 250K reads, each 100-400 bases long. We introduce several formulations of the quasispecies assembly problem and a measure of assembly quality. We also propose a scalable quasispecies assembly method based on a novel network flow formulation. Finally, we report the results of assembling 44 quasispecies from the 1700 bp long E1E2 region of HCV.


Problem Instance, Directed Acyclic Graph, Network Flow, Switching Error, Consensus Genome
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Kelly Westbrooks (1)
  • Irina Astrovskaya (1)
  • David Campo (2)
  • Yury Khudyakov (2)
  • Piotr Berman (3)
  • Alex Zelikovsky (1)
  1. Department of Computer Science, Georgia State University, Atlanta
  2. Centers for Disease Control and Prevention, Atlanta
  3. Department of Computer Science and Engineering, Pennsylvania State University, University Park
