Abstract
Most algorithms for reconstructing evolutionary histories that involve large-scale events, such as duplications, deletions, or rearrangements, work on sequences of predetermined markers, for example protein-coding genes or other functional elements. However, markers defined in this way ignore information contained in non-coding sequences, are prone to annotation errors, and may even introduce artifacts due to partial gene copies or chimeric genes.
We propose the problem of sequence segmentation, where the goal is to automatically select suitable markers based on sequence homology alone. We design an algorithm for this problem that can tolerate a certain amount of inaccuracy in the input alignments and still produce a segmentation of the sequence into markers with high coverage and accuracy. We test our algorithm on several artificial and real data sets representing complex clusters of segmental duplications. Our software is available at http://compbio.fmph.uniba.sk/atomizer/
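To make the segmentation problem concrete, the following toy sketch cuts a sequence at the endpoints of its local self-alignments and merges endpoints that lie close together, so that slightly imprecise alignment boundaries do not produce tiny spurious markers. It is an illustration only and not the algorithm of the paper: the function name `segment_by_breakpoints`, the `tolerance` parameter, and the tuple layout of an alignment are all assumptions introduced here.

```python
from typing import List, Tuple

# Hypothetical alignment record: two half-open intervals on the same sequence,
# (start_a, end_a, start_b, end_b), stating that [start_a, end_a) is homologous
# to [start_b, end_b).
Alignment = Tuple[int, int, int, int]


def segment_by_breakpoints(
    seq_len: int,
    alignments: List[Alignment],
    tolerance: int = 0,
) -> List[Tuple[int, int]]:
    """Split [0, seq_len) into segments bounded by alignment endpoints.

    Endpoints within `tolerance` positions of the first endpoint of a group
    are merged into a single breakpoint (the middle element of the group).
    """
    # Every alignment endpoint strictly inside the sequence is a candidate.
    points = sorted({p for a in alignments for p in a if 0 < p < seq_len})

    merged: List[int] = []
    group: List[int] = []
    for p in points:
        # Start a new group once p is too far from the group's first point.
        if group and p - group[0] > tolerance:
            merged.append(group[len(group) // 2])
            group = []
        group.append(p)
    if group:
        merged.append(group[len(group) // 2])

    bounds = [0] + merged + [seq_len]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


if __name__ == "__main__":
    # Two overlapping, slightly inconsistent alignments on a sequence of length 100.
    example = [(10, 30, 50, 70), (12, 31, 72, 91)]
    print(segment_by_breakpoints(100, example, tolerance=3))
```

With `tolerance=0` every distinct endpoint becomes a breakpoint. A larger tolerance yields fewer, longer segments at the cost of less exact boundaries, which mirrors the trade-off between coverage and accuracy mentioned in the abstract.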
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brejová, B., Burger, M., Vinař, T. (2011). Automated Segmentation of DNA Sequences with Complex Evolutionary Histories. In: Przytycka, T.M., Sagot, M.-F. (eds) Algorithms in Bioinformatics. WABI 2011. Lecture Notes in Computer Science, vol 6833. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23038-7_1
DOI: https://doi.org/10.1007/978-3-642-23038-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23037-0
Online ISBN: 978-3-642-23038-7
eBook Packages: Computer Science, Computer Science (R0)