The universe of exons revisited
We study the distribution of exons in eukaryotic genes to determine whether the reuse of exon sequences can be detected, and we use the frequency of such reuse to estimate how many ancestral exon sequences there might have been. We use two databases of exons. The first contains 56,276 internal exons from putatively unrelated genes (less than 20% sequence identity); the second contains 8,917 internal exons from regions of these genes that are homologous and colinear with prokaryotic genes, known as ancient conserved regions (ACRs). At the 95% significance level we find 3,500 exon-sequence matches in the large database and 500 matches in the ACR database. These matches correspond to groups of similar sequences. The size-rank relationship for these groups follows a power law, with group size falling off as the inverse square root of the rank. This form of the power-law distribution leads us to estimate the size of a possible universe of ancestral exons. Using the data from the ACRs, we estimate that universe to contain about 15,000–30,000 exons.
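The size-rank relationship described above, with group size proportional to rank raised to the power -1/2, can be checked on any list of group sizes by fitting a straight line to log(size) against log(rank). The following is a minimal sketch of that check, not the authors' procedure: the function name `fit_rank_size_exponent` is hypothetical, and the group sizes are synthetic values generated to follow an exact inverse-square-root law rather than data from the exon databases.

```python
import math


def fit_rank_size_exponent(sizes):
    """Fit size = scale * rank**slope by least squares on log-log axes.

    `sizes` is a list of positive group sizes; rank 1 is the largest group.
    Returns (slope, scale).
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


# Synthetic, illustrative sizes only (assumption): size = 100 / sqrt(rank).
synthetic_sizes = [100.0 / math.sqrt(rank) for rank in range(1, 201)]
slope, scale = fit_rank_size_exponent(synthetic_sizes)
print(f"fitted exponent: {slope:.3f}, fitted scale: {scale:.1f}")
```

A fitted slope close to -0.5 is what an inverse-square-root size-rank law would produce. Real group sizes are integers with sampling noise, so the fit on actual data would be approximate.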
Key words: ACR, BLAST, evolution, exon, gene-structure, intron
- Zipf, G.K., 1949. Human Behavior and the Principle of Least Effort. Addison-Wesley, Redwood City, CA.