Haplotype Inferring Via Galled-Tree Networks Is NP-Complete

  • Arvind Gupta
  • Ján Maňuch
  • Ladislav Stacho
  • Xiaohong Zhao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5092)


The problem of determining haplotypes from genotypes has gained considerable prominence in the research community since the beginning of the HapMap project. The focus is on determining the sets of SNP values of individual chromosomes (haplotypes), since such information better captures the genetic causes of diseases. One of the main algorithmic tools for haplotyping is based on the assumption that the evolutionary history of the original haplotypes satisfies perfect phylogeny. This approach can be applied only to individual blocks of chromosomes, in which recombinations are assumed either not to happen or to happen with small frequency. However, exact determination of blocks is usually not possible. It would therefore be desirable to develop a haplotyping method that accounts for recombinations and can thus be applied to multiblock sections of chromosomes. A natural candidate is haplotyping via phylogenetic networks or their simplified version, galled-tree networks, which were introduced by Wang, Zhang, and Zhang ([25]) to model recombinations. However, even haplotyping via galled-tree networks appears hard, as algorithms exist only for very special cases: the galled-tree network has either a single gall ([23]) or only small galls with two mutations each ([8]). Building on our previous results ([6]), we show that, in general, haplotyping via galled-tree networks is NP-complete, and thus indeed hard.
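To make the underlying problem concrete, the following is a minimal sketch of perfect-phylogeny haplotyping on tiny inputs. It is not the method of this paper and does not handle galled-tree networks or recombination. It assumes a common genotype encoding (0 or 1 for a homozygous site, 2 for a heterozygous site) and uses the four-gamete test, under which a set of binary haplotypes admits a perfect phylogeny for some root exactly when no pair of sites shows all four combinations 00, 01, 10, 11. The function names are hypothetical, and the search is exponential brute force intended only for illustration.

```python
from itertools import combinations, product


def resolutions(genotype):
    """Yield every unordered haplotype pair (h1, h2) that resolves a genotype.

    Assumed encoding: 0 / 1 = homozygous site, 2 = heterozygous site.
    """
    het = [i for i, g in enumerate(genotype) if g == 2]
    if not het:
        h = tuple(genotype)
        yield h, h
        return
    # Fix h1 to 0 at the first heterozygous site to skip mirrored pairs.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(genotype), list(genotype)
        for site, b in zip(het, (0,) + bits):
            h1[site], h2[site] = b, 1 - b
        yield tuple(h1), tuple(h2)


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of sites exhibits all four gametes."""
    haps = list(haplotypes)
    if not haps:
        return True
    for i, j in combinations(range(len(haps[0])), 2):
        if len({(h[i], h[j]) for h in haps}) == 4:
            return False
    return True


def brute_force_pph(genotypes):
    """Try all resolutions; return one passing the test, or None."""
    options = [list(resolutions(g)) for g in genotypes]
    for choice in product(*options):
        haps = {h for pair in choice for h in pair}
        if passes_four_gamete_test(haps):
            return list(choice)
    return None


if __name__ == "__main__":
    print(brute_force_pph([(2, 2), (0, 1), (1, 0)]))
    print(brute_force_pph([(0, 0), (0, 1), (1, 0), (1, 1)]))
```

In the first call the genotype (2, 2) must be resolved as 01 and 10, because resolving it as 00 and 11 would produce all four gametes together with the two homozygous genotypes. In the second call no resolution exists, which is the kind of input where a model allowing recombination becomes relevant.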


True Assignment, Phylogenetic Network, Graph Covering, Haplotype Inference, Perfect Phylogeny
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Bordewich, M., Semple, C.: On the Computational Complexity of the Rooted Subtree Prune and Regraft Distance. Annals of Combinatorics 8, 409–423 (2004)
  2. Clark, A.: Inference of Haplotypes from PCR-Amplified Samples of Diploid Populations. Molecular Biology and Evolution 7, 111–122 (1990)
  3. Consortium, I.H.: A Haplotype Map of the Human Genome. Nature 437, 1299–1320 (2005)
  4. Daly, M., Rioux, J., Schaffner, S., Hudson, T., Lander, E.: High-Resolution Haplotype Structure in the Human Genome. Nature Genetics 29(2), 229–232 (2001)
  5. Gabriel, S., Schaffner, S., Nguyen, H., Moore, J., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., Liu-Cordero, S., Rotimi, C., Adeyemo, A., Cooper, R., Ward, R., Lander, E., Daly, M., Altshuler, D.: The Structure of Haplotype Blocks in the Human Genome. Science 296 (2002)
  6. Gupta, A., Manuch, J., Stacho, L., Zhao, X.: Haplotype Inferring via Galled-Tree Networks Using a Hypergraph Covering Problem for Special Genotype Matrices. Discr. Appl. Math. (to appear)
  7. Gupta, A., Manuch, J., Stacho, L., Zhao, X.: Characterization of the Existence of Galled-Tree Networks. J. of Bioinform. and Comp. Biol. 4(6), 1309–1328 (2006)
  8. Gupta, A., Manuch, J., Stacho, L., Zhao, X.: Algorithm for Haplotype Inferring via Galled-Tree Networks with Simple Galls (extended abstract). In: Istrail, S., Pevzner, P., Waterman, M. (eds.) ISBRA 2007. LNCS (LNBI), vol. 4463, pp. 121–132. Springer, Heidelberg (2007)
  9. Gusfield, D.: Inference of Haplotypes from Samples of Diploid Populations: Complexity and Algorithms. J. Comp. Biology 8(3), 305–323 (2001)
  10. Gusfield, D.: Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions. In: Proceedings of the Sixth Annual International Conference on Computational Biology (RECOMB 2002), pp. 166–175 (2002)
  11. Gusfield, D.: Haplotype Inference by Pure Parsimony. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 144–155. Springer, Heidelberg (2003)
  12. Gusfield, D.: Optimal, Efficient Reconstruction of Root-Unknown Phylogenetic Networks with Constrained and Structured Recombination. J. Comput. Syst. Sci. 70(3), 381–398 (2005)
  13. Gusfield, D., Eddhu, S., Langley, C.: The Fine Structure of Galls in Phylogenetic Networks. INFORMS Journal on Computing 16(4), 459–469 (2004)
  14. Gusfield, D., Eddhu, S., Langley, C.: Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination. Journal of Bioinformatics and Computational Biology 2(1), 173–213 (2004)
  15. Gusfield, D., Orzack, S.H.: Handbook of Computational Molecular Biology, Chapter Haplotype Inference. CRC Computer and Information Science Series, pp. 18-1–18-28. Chapman & Hall, Boca Raton (2005)
  16. Helmuth, L.: Genome Research: Map of the Human Genome 3.0. Science 293(5530), 583–585 (2001)
  17. Lancia, G., Pinotti, C., Rizzi, R.: Haplotyping Populations: Complexity and Approximations. Dit-02-082, University of Trento (2002)
  18. Mitra, R.D., Butty, V.L., Shendure, J., Williams, B.R., Housman, D.E., Church, G.M.: Digital Genotyping and Haplotyping with Polymerase Colonies. Proceedings of the National Academy of Sciences of the United States of America 100, 5926–5931 (2003)
  19. Papadimitriou, C.H.: Computational Complexity. Addison-Wesley Publishing Company, Inc. (1994)
  20. Patil, N., Berno, A., Hinds, D., Barrett, W., Doshi, J., Hacker, C., Kautzer, C., Lee, D., Marjoribanks, C., McDonough, D., Nguyen, B., Norris, M., Sheehan, J., Shen, N., Stern, D., Stokowski, R., Thomas, D., Trulson, M., Vyas, K., Frazer, K., Fodor, S., Cox, D.: Blocks of Limited Haplotype Diversity Revealed by High-Resolution Scanning of Human Chromosome 21. Science 294(5547), 1719–1723 (2001)
  21. Pennisi, E.: Breakthrough of the Year: Human Genetic Variation. Science 318(5858), 1842–1843 (2007)
  22. Song, Y.S.: A Concise Necessary and Sufficient Condition for the Existence of a Galled-Tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics 3(2), 186–191 (2006)
  23. Song, Y.S., Wu, Y., Gusfield, D.: Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event. In: Casadio, R., Myers, G. (eds.) WABI 2005. LNCS (LNBI), vol. 3692, pp. 152–164. Springer, Heidelberg (2005)
  24. Thorisson, G., Smith, A., Krishnan, L., Stein, L.: The International HapMap Project Web Site. Genome Research 15, 1591–1593 (2005)
  25. Wang, L., Zhang, K., Zhang, L.: Perfect Phylogenetic Networks with Recombination. Journal of Computational Biology 8(1), 69–78 (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Arvind Gupta (1)
  • Ján Maňuch (1)
  • Ladislav Stacho (1)
  • Xiaohong Zhao (1)
  1. School of Computing Science and Department of Mathematics, Simon Fraser University, Canada
