Abstract
Several studies have shown that the tight correlation among genes’ evolutionary rates is better explained by a model called the Universal Pacemaker (UPM) than by the simple rate constancy posited by the classical Molecular Clock (MC) hypothesis. Under UPM, the relative evolutionary rates of all genes remain nearly constant, whereas the absolute rates can change arbitrarily according to the pacemaker ticks. This evolutionary framework was recently adapted to model epigenetic aging, where methylated sites are the analogs of evolving genes.
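To make the UPM statement concrete, the following is a minimal sketch and not the model specification from the paper. It assumes a multiplicative form in which the edge length of gene i in period j is an intrinsic gene rate times a pacemaker tick times noise, so that log lengths are additive. The function names, the Gaussian choices for rates, ticks, and noise, and all parameter names are illustrative assumptions.

```python
import math
import random


def simulate_upm(n_genes, n_periods, noise_sd, seed=0):
    """Simulate log edge lengths under an assumed single-pacemaker model.

    Assumption (illustrative only): log l[i][j] = log r_i + log t_j + noise,
    where r_i is the intrinsic rate of gene i and t_j is the pacemaker tick
    of period j, shared by all genes.
    """
    rng = random.Random(seed)
    log_rates = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]    # log r_i
    log_ticks = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]  # log t_j
    return [
        [log_rates[i] + log_ticks[j] + rng.gauss(0.0, noise_sd)
         for j in range(n_periods)]
        for i in range(n_genes)
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    logs = simulate_upm(n_genes=3, n_periods=8, noise_sd=0.1)
    # Genes driven by the same pacemaker should have highly correlated profiles.
    print(pearson(logs[0], logs[1]))
```

In this sketch, setting `noise_sd` to zero makes the difference between any two genes' log lengths the same in every period, which is the "constant relative rates" property, while the absolute lengths still vary with the ticks.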
A natural follow-up question is how many such pacemakers exist and which gene adheres to which pacemaker. This turns out to be a nontrivial task that is affected by the number of variables, their random noise, and the amount of available information. To address it, a clustering heuristic was previously devised that exploits the correlation between corresponding edge lengths across thousands of gene trees. Nevertheless, no theoretical study of the relationship between these parameters has been carried out.
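The idea of grouping genes by the correlation of their edge-length profiles can be illustrated with a toy example. This is not the clustering heuristic referred to above, whose details are not given here; it is a deliberately simple greedy scheme, and the function names and the `threshold` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def greedy_correlation_clusters(profiles, threshold=0.9):
    """Assign each gene profile to the first cluster whose representative
    it correlates with at or above `threshold`; otherwise open a new cluster.

    `profiles` is a list of per-gene log edge-length vectors, one entry per
    tree edge. The first member of each cluster serves as its representative
    (an illustrative simplification).
    """
    representatives = []
    labels = []
    for profile in profiles:
        for label, rep in enumerate(representatives):
            if pearson(profile, rep) >= threshold:
                labels.append(label)
                break
        else:
            representatives.append(profile)
            labels.append(len(representatives) - 1)
    return labels


if __name__ == "__main__":
    ticks_a = [0.0, 1.0, 2.0, 3.0]   # log ticks of a hypothetical pacemaker A
    ticks_b = [3.0, 0.0, 2.0, -1.0]  # log ticks of a hypothetical pacemaker B
    genes = [
        [t + 0.5 for t in ticks_a],
        [t - 1.0 for t in ticks_a],
        [t + 2.0 for t in ticks_b],
        list(ticks_b),
    ]
    print(greedy_correlation_clusters(genes))
```

With noise-free profiles the grouping is trivial. As the noise variance grows relative to the spread of the ticks, correlations within a pacemaker drop and such a threshold rule starts to misassign genes, which is the regime the bounds in this work address.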
Here we study this question by providing theoretical bounds, expressed in terms of the system parameters, on the probabilities of positive and negative results. We corroborate these results with a simulation study that reveals the critical role of the variances.
Supported in part by the VolkswagenStiftung grant, project VWZN3157.
Notes
1. We use the acronym “UPM” to refer to the model and “PM” to the pacemaker as a natural/combinatorial object.
Acknowledgments
We would like to thank Eugene Koonin and Yuri Wolf for inspiring the question, and Ilan Newman and Nick Harvey for helpful discussions. We also thank the anonymous reviewers for their helpful and meticulous comments, which helped clarify the exposition. Part of this work was done while the author was visiting the NIH, USA, supported by Intramural funds of the US Department of Health and Human Services.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Snir, S. (2018). Bounds on Identification of Genome Evolution Pacemakers. In: Zhang, F., Cai, Z., Skums, P., Zhang, S. (eds) Bioinformatics Research and Applications. ISBRA 2018. Lecture Notes in Computer Science, vol 10847. Springer, Cham. https://doi.org/10.1007/978-3-319-94968-0_5
DOI: https://doi.org/10.1007/978-3-319-94968-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94967-3
Online ISBN: 978-3-319-94968-0
eBook Packages: Computer Science, Computer Science (R0)