Influence of Tree Topology Restrictions on the Complexity of Haplotyping with Missing Data

  • Michael Elberfeld
  • Ilka Schnoor
  • Till Tantau
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5532)


Haplotyping, also known as haplotype phase prediction, is the problem of predicting likely haplotypes from genotype data. One fast haplotyping method is based on an evolutionary model in which a perfect phylogenetic tree is sought that explains the observed data. Unfortunately, when data entries are missing, as is often the case in laboratory data, the resulting incomplete perfect phylogeny haplotyping problem (IPPH) is NP-complete, and no theoretical results are known concerning its approximability, its fixed-parameter tractability, or exact algorithms for it. Even radically simplified versions, such as the restriction to phylogenetic trees consisting of just two directed paths from a given root, are still NP-complete, although a fixed-parameter algorithm is known for this case. We show that such drastic and ad hoc simplifications are not necessary to make IPPH fixed-parameter tractable: we present the first theoretical analysis of an algorithm, which we develop in the course of the paper, that works for arbitrary instances of IPPH. On the negative side, we show that restricting the topology of perfect phylogenies does not always reduce the computational complexity: while the incomplete directed perfect phylogeny problem is well known to be solvable in polynomial time, we show that the same problem restricted to path topologies is NP-complete.
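To make the notion of a perfect phylogeny "explaining" genotype data concrete, the following minimal sketch checks two conditions for complete (non-missing) haplotypes. It is not the algorithm developed in the paper. It assumes the encoding common in the haplotyping literature (0 or 1 for a homozygous site, 2 for a heterozygous site, `None` for a missing entry) and uses the classical pairwise column test for a directed perfect phylogeny rooted at the all-zero haplotype. The function names are hypothetical.

```python
from itertools import combinations


def admits_directed_perfect_phylogeny(haplotypes):
    """Return True if the binary rows fit a perfect phylogeny rooted at all-zeros.

    Classical pairwise test: for every two columns, the sets of rows
    carrying a 1 must be either disjoint or nested.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    # For each column, collect the indices of the rows that carry a 1.
    ones = [
        {r for r, row in enumerate(haplotypes) if row[c] == 1}
        for c in range(n_sites)
    ]
    for a, b in combinations(ones, 2):
        if a & b and not (a <= b or b <= a):
            return False
    return True


def explains(h1, h2, genotype):
    """Check that the haplotype pair (h1, h2) is consistent with one genotype.

    Assumed encoding: 0 or 1 = homozygous, 2 = heterozygous, None = missing.
    """
    for x, y, g in zip(h1, h2, genotype):
        if g is None:
            continue  # a missing entry places no constraint on this site
        if g == 2:
            if x == y:
                return False  # heterozygous site needs differing alleles
        elif not (x == y == g):
            return False  # homozygous site needs both alleles equal to g
    return True
```

In this sketch a missing entry simply imposes no constraint on the haplotype pair. The difficulty discussed in the abstract comes from having to choose all such unconstrained values so that the resulting haplotypes still admit a perfect phylogeny.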


Tree Topology; Node Label; Light Component; Mutation Tree; Tree Record
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Michael Elberfeld (1)
  • Ilka Schnoor (1)
  • Till Tantau (1)
  1. Institut für Theoretische Informatik, Universität zu Lübeck, Lübeck, Germany
