Efficient Algorithms for SNP Haplotype Block Selection Problems

  • Yaw-Ling Lin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5092)


Global patterns of human DNA sequence variation (haplotypes) defined by common single nucleotide polymorphisms (SNPs) have important implications for identifying disease associations and human traits. Recent genetics research reveals that SNPs within certain haplotype blocks induce only a few distinct common haplotypes in the majority of the population. The existence of haplotype block structure has serious implications for association-based methods for the mapping of disease genes. Our ultimate goal is to select haplotype block designations that best capture the structure within the data.

In this paper we propose several efficient combinatorial algorithms for selecting interesting haplotype blocks under different diversity functions, generalizing many previous results in the literature. In particular, given an m×n haplotype matrix A, we give linear-time algorithms for finding all interval diversities, farthest sites, and the longest block within A. For selecting multiple long blocks under a diversity constraint, we show that k blocks with the longest total length can be found in O(nk) time. We also propose linear-time algorithms for calculating all intra-longest-blocks and all intra-k-longest-blocks.
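To make the terms in the abstract concrete, the following is a minimal Python sketch, not the algorithms of the paper. It assumes one particular diversity function (the number of distinct haplotypes in a column interval) and a diversity limit of at least 1. The function names `diversity`, `farthest_sites`, `longest_block`, and `max_total_length` are hypothetical. Diversity is recomputed naively for each interval, so the farthest-site scan here is not linear time; only the final dynamic program matches the O(nk) bound, and only once the farthest sites are available.

```python
from typing import List, Sequence, Tuple


def diversity(A: Sequence[str], i: int, j: int) -> int:
    """Assumed diversity function: number of distinct rows of A
    restricted to columns i..j (0-based, inclusive)."""
    return len({row[i:j + 1] for row in A})


def farthest_sites(A: Sequence[str], limit: int) -> List[int]:
    """left[j] is the smallest i with diversity(A, i, j) <= limit.

    Because diversity cannot decrease when an interval grows, left[j]
    is nondecreasing in j, so a two-pointer scan suffices.
    Assumes limit >= 1; left[j] == j + 1 denotes an empty block.
    """
    n = len(A[0])
    left = [0] * n
    i = 0
    for j in range(n):
        while diversity(A, i, j) > limit:
            i += 1
        left[j] = i
    return left


def longest_block(left: Sequence[int]) -> Tuple[int, int]:
    """Return (start, end) of a longest feasible block."""
    j = max(range(len(left)), key=lambda c: c - left[c] + 1)
    return left[j], j


def max_total_length(left: Sequence[int], k: int) -> int:
    """Largest total length of at most k disjoint feasible blocks.

    O(nk) dynamic program: for each column, either skip it or end the
    longest feasible block there and combine it with the best
    (t-1)-block solution on the preceding columns.
    """
    n = len(left)
    prev = [0] * (n + 1)  # prev[j]: best with one block fewer, first j columns
    for _ in range(k):
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            i = left[j - 1]  # start of longest block ending at column j-1
            cur[j] = max(cur[j - 1], prev[i] + (j - i))
        prev = cur
    return prev[n]


if __name__ == "__main__":
    A = ["00110", "00101", "11010", "11001"]  # 4 haplotypes, 5 SNP sites
    left = farthest_sites(A, limit=2)
    print(left, longest_block(left), max_total_length(left, 2))
```

Using the longest block ending at each column is safe in this sketch because every sub-interval of a feasible interval is itself feasible under the assumed diversity function, so shortening a block never helps.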


Diversity Function; Haplotype Block; Linear Time Algorithm; Diversity Constraint; Output Size
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yaw-Ling Lin, Dept. of Computer Science and Information Engineering, Providence University, Taiwan
