Abstract
Spliceosomal introns are one of the principal distinctive features of eukaryotes. Nevertheless, large-scale studies disagree about even the most basic features of their evolution. To obtain a more reliable reconstruction of intron evolution, we developed a model that is far more comprehensive than previous ones. The model is rich in parameters, and estimating them accurately by straightforward likelihood maximization is infeasible. We therefore developed an expectation-maximization algorithm that makes the maximization efficient. Here, we outline the model and describe the expectation-maximization algorithm in detail. Because the method works with intron presence–absence maps, it is expected to be useful for analyzing the evolution of other binary characters as well.
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Carmel, L., Rogozin, I.B., Wolf, Y.I., Koonin, E.V. (2009). A Maximum Likelihood Method for Reconstruction of the Evolution of Eukaryotic Gene Structure. In: Ireton, R., Montgomery, K., Bumgarner, R., Samudrala, R., McDermott, J. (eds) Computational Systems Biology. Methods in Molecular Biology, vol 541. Humana Press. https://doi.org/10.1007/978-1-59745-243-4_16
DOI: https://doi.org/10.1007/978-1-59745-243-4_16
Published:
Publisher Name: Humana Press
Print ISBN: 978-1-58829-905-5
Online ISBN: 978-1-59745-243-4
eBook Packages: Springer Protocols