Molecular archeology of an SP100 splice variant revisited: dating the retrotranscription and Alu insertion events
SP100 is a nuclear protein that displays a number of alternative splice variants. In Old World monkeys, apes and humans one of these variants is extended by a retroprocessed pseudogene, HMG1L3, whose antecedent gene is a member of the family of high-mobility-group proteins, HMG1. This is one of only a few documented cases of a retropseudogene being incorporated into another gene as a functional exon. In addition to the HMG1L3 insertion, Old World monkey genomes also contain an Alu sequence within the last SP100-HMG intron. PCR amplification of the 3' end of the SP100 gene using genomic DNAs from human, New World monkey and Old World monkey species, followed by direct sequencing of the amplicons, has made it possible to date the HMG1L3 and Alu insertion events.
PCR amplifications confirm that the HMG1L3 retrotransposition into the SP100 locus occurred after divergence of New World and Old World monkey lineages, some 35-40 million years ago. PCR amplification also shows that an upstream Alu sequence was inserted in the last SP100-HMG intron after divergence of the Old World monkey and ape lineages. Direct sequencing of the Alu in five Old World monkey species places the latter event at around 19 million years ago. Finally, ten single base mutations and one deletion in the Alu differentiate African from Asian Old World monkey species.
PCR and DNA sequence analysis of 'genetic fossils' such as retropseudogenes and Alu elements in primates give details as to the timing of such events and can reveal sequence features useful for other molecular phylogenetic applications.
Keywords: World Monkey, Vervet Monkey, Alternative Splice Variant, Single Base Change, SP100 Gene
Retroprocessed pseudogenes, or retropseudogenes, are reverse transcripts of mature mRNAs retrotransposed to new locales within the genome. Recently, these loci have received increasing attention. Goncalves et al. have shown that retropseudogenes are quite common in mammalian genomes; 23,000 to 33,000 are estimated to reside in the human genome. Studies of both point mutations and indels (insertions/deletions) in retropseudogenes have shown them to be excellent sources of background genetic information in a wide range of species. Thus, one of the emerging utilities of retropseudogenes is their role in providing markers for phylogenetic studies between species or between populations within species [6,7,8,9,10].
Among the retropseudogenes studied to date, the high-mobility-group (HMG) pseudogene HMG1L3 is a member of a rare class in which all or part of the encoded protein is still expressed. Seeler et al. reported that the nuclear protein SP100 displays a number of alternative splice variants. One of these, called SP100-HMG, is an 879 amino acid protein whose carboxy-terminal 170 residues bear a close similarity to the family of HMG proteins. Rogalla et al. identified five retropseudogenes for which the antecedent gene is HMG1. Subsequently, Rogalla et al. demonstrated that the carboxy-terminal extension of SP100-HMG is encoded by part of one of these HMG-1 retropseudogenes. Denoted HMG1L3, this retrotranscribed copy was inserted at the 3' end of the SP100 gene and has become incorporated into the locus as an exon, resulting in the addition of a DNA-binding function to the SP100 protein.
Rogalla et al. performed a number of PCR amplifications using primer sequences from the 3' end of the SP100 locus. Different PCR primer combinations produced amplicons variously containing: the penultimate exon encoding a 14 amino acid joining region between SP100 and HMG1L3; the last SP100 intron; and the entire HMG1L3 pseudogene. Genomic DNA from human, chimpanzee, gorilla, gibbon and rhesus macaque was used in their study. Results suggest that the retrotransposition of HMG1L3 into the SP100 locus occurred at least 35 million years ago. In addition, a PCR amplicon produced from the rhesus macaque revealed the presence of an Alu sequence between the penultimate SP100 exon and the HMG1L3 insertion site that is not present in hominoid genomes. Here, I have used an expanded panel of New World and Old World monkey species to refine dating of both the HMG1L3 retrotransposition and the Alu insertion events.
Results and discussion
In support of the above suggestion, a third PCR primer, SP100-HMG3, was chosen from SP100 genomic sequence upstream of the 5' HMG1L3 insertion site. Amplification with this primer and a1PICdo yields a 292 bp amplicon in human and Old World monkey samples but no product in the New World monkey samples (Figure 2). Together, these results demonstrate that New World monkey species do not have HMG1L3, but that it is probably present throughout the Old World monkeys as well as ape and human (Hominoidea) genomes. Clearly, the reverse transcription and retrotransposition of HMG-1 that resulted in the creation of HMG1L3 occurred after divergence of Old World primate species (Catarrhini) from New World primates (Platyrrhini), but prior to the divergence from the Catarrhini of the lineage leading to apes and humans. Estimates of the origin and subsequent phylogenetic radiation of the Anthropoidea offered by Kay et al. place these events in the late Eocene to middle Oligocene, or between 30 and 40 million years ago.
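The presence/absence reasoning above can be sketched in code. This is a minimal illustration, not the analysis performed in this study: the three-group tree, the taxon labels and the function names are all assumptions chosen for the example. The sketch finds the smallest clade containing every taxon that carries an element and checks that no taxon lacking it falls inside that clade, which is the single-insertion reading of the PCR results.

```python
# Simplified anthropoid tree as nested tuples. The topology and the taxon
# labels are assumptions made for this illustration.
TREE = ("New World monkeys", ("Old World monkeys", "Hominoids"))


def leaves(node):
    """Return the set of taxon names found under a tree node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found


def smallest_clade(node, carriers):
    """Return the smallest subtree whose leaves include every carrier."""
    if isinstance(node, str):
        return node
    for child in node:
        if carriers <= leaves(child):
            return smallest_clade(child, carriers)
    return node


def single_insertion_clade(tree, presence):
    """Place one insertion event on the tree from presence/absence calls.

    Returns (clade, consistent). `consistent` is True when every taxon in
    the clade carries the element, so that one insertion and no later loss
    explains the pattern.
    """
    carriers = {taxon for taxon, present in presence.items() if present}
    if not carriers:
        raise ValueError("no taxon carries the element")
    clade = smallest_clade(tree, carriers)
    return clade, leaves(clade) == carriers


# Presence/absence calls as described in the text.
HMG1L3 = {"New World monkeys": False, "Old World monkeys": True, "Hominoids": True}
ALU = {"New World monkeys": False, "Old World monkeys": True, "Hominoids": False}

for name, calls in (("HMG1L3", HMG1L3), ("Alu", ALU)):
    clade, consistent = single_insertion_clade(TREE, calls)
    print(name, "->", clade, "| single insertion, no loss:", consistent)
```

Under these assumptions HMG1L3 maps to the branch leading to all catarrhines, and the Alu maps to the Old World monkey branch alone. A pattern that returned `consistent` as False would require either a second insertion or a loss.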
Results illustrated in Figure 2 also show that the 300 bp Alu sequence found in the region between the penultimate exon of SP100 and the HMG1L3 insertion site in the genome of Macaca mulatta is present in the genomes of other Old World monkey species from Asia, the Indian subcontinent and Africa. Previous results clearly show that the Alu is not present in any hominoid genome. Again, relying on the anthropoid phylogeny of Kay et al., insertion of the Alu would have to have occurred after the divergence of the hominoids, or not more than 25 million years ago. An alternative view is that the Alu sequence insertion in SP100 occurred prior to the divergence of the hominoids, perhaps even at the same time as the HMG1L3 insertion, but that it was lost in the line leading to Hominoidea after divergence. However, the latter possibility is unlikely, for the following reasons: individual Alu sequences arise via unique insertion events; they are inserted in a sequence-independent manner into breaks in genomic DNA; and those breaks are subsequently repaired with the Alu embedded at the break point. Once inserted, Alu sequences remain stable features of the host genome. Although Alu sequences have been lost from host genomes, their excision is never as clean as their insertion. Either only part of the Alu sequence is lost or a loss of flanking genomic DNA occurs along with loss of the Alu sequence [18,19].
On the basis of these results, the most parsimonious scenario involves insertion of the Alu into the 3' region of the catarrhine SP100 gene and loss of the 22 base upstream sequence after hominid-catarrhine divergence between 20 and 25 million years ago. The most recent point at which these events might have occurred is 10 million years ago, the time at which the Cercopithecidae, represented by C. aethiops, and the Papionidae, represented by baboons and macaques, diverged [22,23]. This gives a window of 10-15 million years for the Alu insertion. Should members of the Colobinae, such as Colobus, Presbytis or Nasalis, have the Alu, the upper limit would be pushed back to 16-18 million years ago and restrict the insertion window to only 5-10 million years. Taking an estimate of 5 × 10⁻⁹ nucleotide substitutions per site per year for pseudogenes, mutations in the Alu sequences shown here suggest a date on the order of 19 million years ago for the insertion event. This is consistent with both the molecular and paleontologic data.
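As a rough illustration of the rate-based date, the relationship between per-site divergence and elapsed time can be written out. This is a sketch under stated assumptions, not the calculation reported here: it assumes substitutions accumulate linearly along a single lineage at the quoted rate, it ignores multiple hits at the same site, and it takes the Alu length as the approximately 300 bp given above. The function names are hypothetical, and the substitution counts actually observed are not restated.

```python
RATE = 5e-9        # substitutions per site per year (pseudogene rate quoted in the text)
ALU_LENGTH = 300   # approximate element length in bp (from the text)


def age_from_divergence(subs_per_site, rate=RATE):
    """Years elapsed for a lineage to accumulate the given per-site divergence."""
    return subs_per_site / rate


def expected_divergence(age_years, rate=RATE):
    """Per-site divergence expected after the given number of years."""
    return age_years * rate


# Work backwards from the ~19 million year estimate to see what it implies.
divergence = expected_divergence(19e6)
changes = divergence * ALU_LENGTH
print(f"expected divergence: {divergence:.3f} substitutions per site")
print(f"expected changes over {ALU_LENGTH} bp: {changes:.1f}")
```

Under these assumptions, an age of 19 million years corresponds to about 0.095 substitutions per site, or roughly 28 to 29 changes across a 300 bp element. A pairwise comparison between two species would instead divide the observed divergence by twice the rate, because both lineages accumulate changes.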
Materials and methods
Genomic DNA samples
Genomic DNA samples from New World and Old World monkey species were obtained through the generosity of a number of investigators. Human genomic DNAs were extracted from whole blood samples collected by the author under informed consent.
PCR amplification and amplicon sequencing
PCR primers were synthesized at Integrated DNA Technologies using standard phosphoramidite chemistry. Sequences PICauf1, 5'-TCTCTTCGATCTCCCTTTTCTG-3', and a1PICdo, 5'-TCTTCCATGTCTCTGAGCACTTCT-3', were previously published. PCR conditions used for these primers are 94°C for 5 min, followed by 35 cycles of 94°C for 30 sec; 53°C for 30 sec; 72°C for 45 sec, with a final extension of 72°C for 7 min. These amplifications are optimal at 1 mM MgCl₂ concentration. Other primers used in this study, SP100-HMG3, 5'-CAAGGGACATTACTTAACACGAGG-3'; SP100-HMG4, 5'-GGATGGACTTGATCTCTTGACC-3'; and SP100-HMG5, 5'-AGTCATGACATAGTGTGCCTGG-3', were selected from SP100-HMG sequences deposited in GenBank (Accession numbers AF076675 and AF146342). Amplifications using SP100-HMG3 and a1PICdo were carried out under the same conditions as above with an annealing temperature of 55°C at 1.5 mM MgCl₂, and those involving SP100-HMG4 and SP100-HMG5 at an annealing temperature of 54°C at 1.5 mM MgCl₂. Amplicons were resolved on 1.4% agarose gels.
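For readers who want to check the primer and cycling details, the following sketch tabulates basic properties of the two previously published primers and totals the programmed hold time of the cycling protocol. It is offered only as a convenience under stated assumptions: the function names are hypothetical, the time total excludes thermocycler ramping between temperatures, and nothing here reflects how the primers or conditions were actually chosen.

```python
# Previously published primers, written 5' to 3' as given in the text.
PRIMERS = {
    "PICauf1": "TCTCTTCGATCTCCCTTTTCTG",
    "a1PICdo": "TCTTCCATGTCTCTGAGCACTTCT",
}


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def programmed_seconds(initial, cycle_steps, cycles, final):
    """Total programmed hold time in seconds, ignoring ramp times."""
    return initial + cycles * sum(cycle_steps) + final


# Cycling protocol from the text: 94°C for 5 min; 35 cycles of 30 sec
# denaturation, 30 sec annealing, 45 sec extension; 72°C for 7 min.
TOTAL_SECONDS = programmed_seconds(
    initial=5 * 60,
    cycle_steps=(30, 30, 45),
    cycles=35,
    final=7 * 60,
)

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
print(f"programmed hold time: {TOTAL_SECONDS / 60:.2f} min")
```

The same `programmed_seconds` call applies to the other primer pairs, since only the annealing temperature and MgCl₂ concentration differ between them, not the step durations.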
PCR amplicons selected for sequencing were cloned into the TOPO-TA PCR cloning vector (Invitrogen, Carlsbad, USA). Sequencing was performed in both directions on an Applied Biosystems Model 310 Automated Fluorescence Sequencer.
I thank Moses Schanfield, Edward Max, Boris Lapin and the Southwest Foundation for Biomedical Research for their generosity in providing genomic DNA samples. Amplicon sequencing was carried out by Susanna Rezikyan at IDT.
- 7. Devor EJ: Use of molecular beacons to verify that the serine hydroxy-methyltransferase pseudogene SHMT-ps1 is unique to the order Primates. Genome Biol. 2001, 2: research0006.1-0006.5. 10.1186/gb-2001-2-2-research0006.
- 12. Seeler JS, Marchio A, Sitterlin D, Transy C, Dejean A: Interaction of SP100 with HP1 proteins: A link between the promyelocytic leukemia-associated nuclear bodies and the chromatin compartment. Proc Natl Acad Sci USA. 1998, 95: 7316-7321. 10.1073/pnas.95.13.7316.
- 24. Szalay FS, Delson E: Evolutionary History of the Primates. New York: Academic Press; 1979.