On the Complexity of Several Haplotyping Problems

  • Rudi Cilibrasi
  • Leo van Iersel
  • Steven Kelk
  • John Tromp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3692)


We present several new results pertaining to haplotyping. The first set of results concerns the combinatorial problem of reconstructing haplotypes from incomplete and/or imperfectly sequenced haplotype data. More specifically, we show that an interesting, restricted case of Minimum Error Correction (MEC) is NP-hard, question earlier claims about a related problem, and present a polynomial-time algorithm for the ungapped case of Longest Haplotype Reconstruction (LHR). Secondly, we present a polynomial-time algorithm for the problem of resolving genotype data using as few haplotypes as possible (the Pure Parsimony Haplotyping Problem, PPH) where each genotype has at most two ambiguous positions, thus solving an open problem posed by Lancia et al. [15].
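To make the MEC objective concrete, here is a minimal brute-force sketch in Python. It assumes the formulation commonly used in the haplotyping literature, which the abstract itself does not spell out: fragments are rows over 0, 1 and a hole symbol, and the goal is to find two haplotypes that minimise the number of fragment entries that must be flipped so that every fragment agrees with one of them. The names `HOLE`, `distance` and `mec_brute_force` are illustrative and do not come from the paper, and the exhaustive search is exponential in the number of SNP positions, so it is only suitable for toy inputs.

```python
from itertools import product

HOLE = "-"  # assumed symbol for an unread (missing) position in a fragment


def distance(fragment, haplotype):
    """Count positions where the fragment has a value that differs from the haplotype.

    Holes in the fragment never count as errors.
    """
    return sum(1 for f, h in zip(fragment, haplotype) if f != HOLE and f != h)


def mec_brute_force(fragments):
    """Return (cost, h1, h2) minimising the total number of corrections.

    Each fragment is charged its distance to the closer of the two haplotypes.
    Tries every pair of binary haplotypes, so it runs in time exponential
    in the fragment length.
    """
    n = len(fragments[0])
    best = None
    for h1 in product("01", repeat=n):
        for h2 in product("01", repeat=n):
            cost = sum(min(distance(f, h1), distance(f, h2)) for f in fragments)
            if best is None or cost < best[0]:
                best = (cost, "".join(h1), "".join(h2))
    return best


if __name__ == "__main__":
    # Three fragments with holes that split cleanly into two haplotypes.
    print(mec_brute_force(["01--", "--10", "10--"])[0])
    # Four gap-free fragments where one entry has to be corrected.
    print(mec_brute_force(["0101", "0101", "1010", "0111"])[0])
```

The hardness result mentioned in the abstract is what makes an exhaustive search like this the natural baseline for small instances; it is not the authors' algorithm.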


Keywords: Bipartite Graph; Input Matrix; Maximum Match; Ambiguous Position; Parity Haplotype
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




References

  1. Alon, N., Sudakov, B.: On Two Segmentation Problems. Journal of Algorithms 33, 173–184 (1999)
  2. Ausiello, G., Crescenzi, P., Gambosi, G., Kann, V., Marchetti-Spaccamela, A., Protasi, M.: Complexity and Approximation - Combinatorial optimization problems and their approximability properties. Springer, Heidelberg (1999)
  3. Bafna, V., Istrail, S., Lancia, G., Rizzi, R.: Polynomial and APX-hard cases of the individual haplotyping problem. Theoretical Computer Science (2004)
  4. Bonizzoni, P., Della Vedova, G., Dondi, R., Li, J.: The Haplotyping Problem: An Overview of Computational Models and Solutions. Journal of Computer Science and Technology 18(6), 675–688 (2003)
  5. Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering in large graphs via Singular Value Decomposition. Journal of Machine Learning 56, 9–33 (2004)
  6. Gavril, F.: Testing for equality between maximum matching and minimum node covering. Information Processing Letters 6, 199–202 (1977)
  7. Greenberg, H.J., Hart, W.E., Lancia, G.: Opportunities for Combinatorial Optimisation in Computational Biology. INFORMS Journal on Computing 16(3), 211–231 (2004)
  8. Halldorsson, B.V., Bafna, V., Edwards, N., Lippert, R., Yooseph, S., Istrail, S.: A Survey of Computational Methods for Determining Haplotypes. In: Istrail, S., Waterman, M.S., Clark, A. (eds.) DIMACS/RECOMB Satellite Workshop 2002. LNCS (LNBI), vol. 2983, pp. 26–47. Springer, Heidelberg (2004)
  9. Hopcroft, J.E., Karp, R.M.: An n^{5/2} algorithm for maximum matching in bipartite graphs. SIAM Journal on Computing 2, 225–231 (1973)
  10. Jiao, Y., Xu, J., Li, M.: On the k-Closest Substring and k-Consensus Pattern Problems. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 130–144. Springer, Heidelberg (2004)
  11. Kleinberg, J., Papadimitriou, C., Raghavan, P.: Segmentation Problems. In: Proceedings of STOC 1998, pp. 473–482 (1998)
  12. Kleinberg, J., Papadimitriou, C., Raghavan, P.: A Microeconomic View of Data Mining. Data Mining and Knowledge Discovery 2, 311–324 (1998)
  13. Kleinberg, J., Papadimitriou, C., Raghavan, P.: Segmentation Problems. Journal of the ACM 51(2), 263–280 (2004). Note: this paper is somewhat different to the 1998 version
  14. Lancia, G., Bafna, V., Istrail, S., Lippert, R., Schwartz, R.: SNPs Problems, Complexity and Algorithms. In: Proceedings of the 9th Annual European Symposium on Algorithms, pp. 182–193 (2001)
  15. Lancia, G., Pinotti, M.C., Rizzi, R.: Haplotyping Populations by Pure Parsimony: Complexity of Exact and Approximation Algorithms. INFORMS Journal on Computing 16(4), 348–359 (2004)
  16. Lancia, G., Rizzi, R.: A polynomial solution to a special case of the parsimony haplotyping problem. To appear in Operations Research Letters
  17. Ostrovsky, R., Rabani, Y.: Polynomial-Time Approximation Schemes for Geometric Min-Sum Median Clustering. Journal of the ACM 49(2), 139–156 (2002)
  18. Panconesi, A., Sozio, M.: Fast Hare: A Fast Heuristic for Single Individual SNP Haplotype Reconstruction. In: Jonassen, I., Kim, J. (eds.) WABI 2004. LNCS (LNBI), vol. 3240, pp. 266–277. Springer, Heidelberg (2004)
  19. Personal communication with Christos H. Papadimitriou (June 2005)
  20. Rizzi, R., Bafna, V., Istrail, S., Lancia, G.: Practical Algorithms and Fixed-Parameter Tractability for the Single Individual SNP Haplotyping Problem. In: Guigó, R., Gusfield, D. (eds.) WABI 2002. LNCS, vol. 2452, pp. 29–43. Springer, Heidelberg (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Rudi Cilibrasi (2)
  • Leo van Iersel (1)
  • Steven Kelk (2)
  • John Tromp (2)

  1. Technische Universiteit Eindhoven (TU/e), Eindhoven, Netherlands
  2. Centrum voor Wiskunde en Informatica (CWI), Amsterdam, Netherlands
