Abstract
There has been considerable recent interest in the use of haplotype structure to aid in the design and analysis of case-control association studies searching for genetic predictors of human disease. The use of haplotype structure is based on the premise that genetic variations that are physically close on the genome will often be predictive of one another due to their frequent descent intact through recent evolution. Understanding these correlations between sites should make it possible to minimize the amount of redundant information gathered through assays or examined in association tests, improving the power and reducing the cost of the studies. In this work, we evaluate the potential value of haplotype structure in this context by applying it to two key sub-problems: inferring hidden polymorphic sites in partial haploid sequences and choosing subsets of variants that optimally capture the information content of the full set of sequences. We develop methods for these approaches based on a prior method we developed for predicting piece-wise shared ancestry of haploid sequences. We apply these methods to a case study of two genetic regions with very different levels of sequence diversity. We conclude that haplotype correlations do have considerable potential for these problems, but that the degree to which they are useful will be strongly dependent on the population sizes available and the specifics of the genetic regions examined.
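To make the second sub-problem concrete, the following is a minimal sketch of one simple reading of "choosing subsets of variants": find the smallest set of sites that still tells every distinct haplotype apart. This is an illustrative brute-force search, not the method developed in the paper. The function names, the toy haplotypes, and the distinguishability criterion are all assumptions introduced here.

```python
from itertools import combinations


def distinguishes_all(haplotypes, sites):
    """Return True if restricting to `sites` keeps all distinct haplotypes distinct."""
    distinct = set(haplotypes)
    projected = {tuple(h[i] for i in sites) for h in distinct}
    return len(projected) == len(distinct)


def smallest_tag_set(haplotypes):
    """Exhaustively search for a smallest distinguishing set of site indices.

    Hypothetical helper: exponential in the number of sites, so only
    suitable for toy inputs like the one below.
    """
    n_sites = len(haplotypes[0])
    for k in range(n_sites + 1):
        for sites in combinations(range(n_sites), k):
            if distinguishes_all(haplotypes, sites):
                return sites
    return tuple(range(n_sites))


# Made-up haploid sequences over five biallelic sites (0/1 alleles).
# Sites 0-3 are perfectly correlated with one another; site 4 is independent.
toy_haplotypes = ["00110", "00111", "11000", "11001", "00110"]
tags = smallest_tag_set(toy_haplotypes)
print(tags)
```

In this toy data the first four sites carry redundant information, so a single one of them plus the fifth site is enough to recover which haplotype is present. Realistic data would call for a heuristic or a statistical criterion rather than exhaustive search.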
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schwartz, R., Clark, A.G., Istrail, S. (2004). Inferring Piecewise Ancestral History from Haploid Sequences. In: Istrail, S., Waterman, M., Clark, A. (eds) Computational Methods for SNPs and Haplotype Inference. RSNPsH 2002. Lecture Notes in Computer Science, vol 2983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24719-7_5
DOI: https://doi.org/10.1007/978-3-540-24719-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21249-2
Online ISBN: 978-3-540-24719-7
eBook Packages: Springer Book Archive