The Sinbad retrotransposon from the genome of the human blood fluke, Schistosoma mansoni, and the distribution of related Pao-like elements
Of the major families of long terminal repeat (LTR) retrotransposons, the Pao/BEL family is probably the least well studied. It is becoming apparent that numerous LTR retrotransposons and other mobile genetic elements have colonized the genome of the human blood fluke, Schistosoma mansoni.
A proviral form of Sinbad, a new LTR retrotransposon, was identified in the genome of S. mansoni. Phylogenetic analysis indicated that Sinbad belongs to one of five discrete subfamilies of Pao/BEL-like elements. BLAST searches of whole genomes and EST databases indicated that members of this clade occurred in species of the Insecta, Nematoda, Echinodermata and Chordata, as well as Platyhelminthes, but were absent from all plants, fungi and lower eukaryotes examined. Among the deuterostomes examined, only aquatic species harbored these types of elements. All four species of nematode examined were positive for Sinbad sequences, although among insect and vertebrate genomes, some were positive and some negative. The full length, consensus Sinbad retrotransposon was 6,287 bp long and was flanked at its 5'- and 3'-ends by identical LTRs of 386 bp. Sinbad displayed a triple Cys-His RNA binding motif characteristic of Gag of Pao/BEL-like elements, followed by the enzymatic domains of protease, reverse transcriptase (RT), RNAseH, and integrase, in that order. A phylogenetic tree of deduced RT sequences from 26 elements revealed that Sinbad was most closely related to an unnamed element from the zebrafish Danio rerio and to Saci-1, also from S. mansoni. It was also closely related to Pao from Bombyx mori and to Ninja of Drosophila simulans. Sinbad was only distantly related to the other schistosome LTR retrotransposons Boudicca, Gulliver, Saci-2, Saci-3, and Fugitive, which are gypsy-like. Southern hybridization and bioinformatics analyses indicated that there were about 50 copies of Sinbad in the S. mansoni genome. The presence of ESTs representing transcripts of Sinbad in numerous developmental stages of S. mansoni along with the identical 5'- and 3'-LTR sequences suggests that Sinbad is an active retrotransposon.
Sinbad is a Pao/BEL type retrotransposon from the genome of S. mansoni. The Pao/BEL group appears to be comprised of at least five discrete subfamilies, which tend to cluster with host species phylogeny. Pao/BEL type elements appear to have colonized only the genomes of the Animalia. The distribution of these elements in the Ecdysozoa, Deuterostomia, and Lophotrochozoa is discontinuous, suggesting horizontal transmission and/or efficient elimination of Pao-like mobile genetic elements from some genomes.
Keywords: Bacterial Artificial Chromosome; Long Terminal Repeat; Bacterial Artificial Chromosome Library; Mobile Genetic Element; Zinc Finger Motif
MGE: mobile genetic element
ORF: open reading frame
EST: expressed sequence tag
LTR: long terminal repeat
BAC: bacterial artificial chromosome
Schistosoma mansoni, the African blood fluke and etiological agent of intestinal schistosomiasis, is endemic in numerous countries in Africa, the Middle East, the Caribbean and northeastern South America. The life cycle of S. mansoni involves parasitism of both humans and aquatic snails of the genus Biomphalaria. Cercariae, the infectious larvae, emerge from the snails into lakes and fresh water streams, where they initiate human infection by direct penetration of the skin. Within the infected person, the worms develop into male and female adults within the portal system blood vessels and mesenteric veins of the intestines. Eggs released from the female parasite into the blood traverse the intestinal wall and are passed out with the feces. Among the tropical diseases, schistosomiasis ranks second only to malaria in terms of morbidity and mortality  and has proved refractory to control by the more conventional public health approaches. No vaccine is yet available.
Mobile genetic elements (MGEs) represent a major force driving the evolution of eukaryotic genomes [2, 3, 4] and play an important role in the establishment of genome size. One of the major categories of MGEs is the long terminal repeat (LTR) retrotransposable element, i.e. the LTR retrotransposons and the retroviruses. These elements are of interest for their potential for horizontal transmission, as well as their ability to shed light on phylogenies of their host organisms when solely vertically transmitted. The genomes of schistosomes, blood flukes of the phylum Platyhelminthes, are estimated at ~270 megabase pairs (Mb) per haploid genome, arrayed on seven pairs of autosomes and one pair of sex chromosomes [8, 9]. Both the evolution and size of this genome may be highly influenced by mobile genetic elements. Indeed, more than half of the schistosome genome appears to be composed of, or derived from, repetitive sequences, to a large extent from retrotransposable elements [10, 12]. Mobile genetic elements colonizing the genome of S. mansoni are of interest both for their potential in developing tools for schistosome transgenesis and for their influence on the evolution and structure of the schistosome genome [13, 14]. Previously characterized schistosome MGEs include SINE-like retroposons [15, 16], long terminal repeat (LTR) retrotransposons [12, 17, 18], non-LTR retrotransposons [10, 11], and DNA transposons related to bacterial IS1016 insertion sequences. Boudicca, the first LTR retrotransposon characterized from the genome of S. mansoni, belongs to the gypsy-like retrotransposons, one of three highly divergent groups of LTR retrotransposons: the Gypsy/Ty3 group, the Copia/Ty1 group and the Pao/BEL group.
Although active replication of schistosome retrotransposons has not been established, transcripts encoding reverse transcriptase (RT) and endonuclease are detectable [10, 11, 22], as is RT activity in parasite extracts , suggesting that at least some of these elements are actively mobile within the genome. Indeed, actively replicating MGEs have been described from other platyhelminths as RNA intermediates  and DNA transposons [25, 26]. Furthermore, the schistosome retrotransposons characterized so far are highly represented within the genome with copy numbers of up to 10,000 [10, 20].
It has been suggested that the Pao-like elements exhibit a host range limited to insects and nematodes . More recently, however, Pao-like sequences have been reported from vertebrates including the teleost fishes Takifugu rubripes and Danio rerio . Here we have characterized a new Pao-like element from the genome of S. mansoni, which we have named Sinbad after the mariner-explorer Sinbad from the classical Persian/Arabic tales of the "1001 Arabian Nights" (e.g., ). (Sinbad roved through near Eastern countries where schistosomiasis remains endemic even today .) Further, we investigated the phylogenetic distribution of Pao-like elements related to Sinbad and report that there is a discontinuous distribution of these elements throughout the Ecdysozoa, Deuterostomia, and Lophotrochozoa that suggests horizontal transmission and/or efficient elimination of Pao-like mobile genetic elements from some host genomes.
An LTR retrotransposon in BAC 33-N-3
Pao-like nucleoprotein, protease and reverse transcriptase
RNAse H and Integrase of Sinbad
As noted, the IN of Sinbad exhibited identity to Saci-1 from S. mansoni, and indeed these Pao-like retrotransposons from S. mansoni share substantial identity in deduced amino acid sequence and in structural organization. This similarity extended to several other domains including the triple Cys-His box region of Gag, 32% identical (23/71, Fig. 2A); PR, 32% identical (36/111, Fig. 2B); and RT, 45% identical (106/236, Fig. 3). Whereas these levels of sequence identity confirmed a close relationship between Sinbad and Saci-1, they also demonstrated that Sinbad and Saci-1 are distinct retrotransposons. Finally, Sinbad did not appear to encode an envelope protein, the retroviral gene product necessary for extracellular existence and infection.
Sinbad, a new Pao/BEL clade retrotransposon, is closely related to Pao and Ninja
In addition, a phylogram of IN sequences was assembled from 14 Pao/BEL family retrotransposons. The tree displayed the same general topology of branches as the RT-based phylogram and supported our suggestion that there are (at least) five discrete sub-families of Pao/BEL family retrotransposons: Tas-like, BEL-like, Pao-like, Sinbad/Saci-1-like, and Suzu-like (not shown; tree available from corresponding author). In similar fashion to the RT-based tree, Sinbad and Saci-1 were closely related to each other and to the IN from the unnamed Pao-like element from zebrafish (BK005571).
Copies of Sinbad interspersed throughout the schistosome genome
Table: Estimation of gene copy number of the Sinbad LTR retrotransposon in the genome of Schistosoma mansoni. Column headings: Number of hits (Expect 0.000001); Reported copy number. Row label: Cathepsin D, intron 4, AY309267 (nt 3213–4849).
Sinbad-like elements transcribed in developmental stages of S. mansoni
BLASTn analyses were undertaken using the full length of Sinbad as the query sequence and the GenBank EST database of non-human, non-mouse sequences. The database includes more than 130,000 EST sequences from six developmental stages of S. mansoni – egg, miracidium, cercaria, germball (= sporocyst), schistosomulum, and mixed sex adults [39, 40]. Significant hits were found to ESTs from all of these six developmental stages. Of these, the hits with highest similarity to Sinbad, CD111741, CD060185, CD163413, CD062550, CD156994, and CD156946, exhibited contiguous ORFs spanning each EST without frameshifts or stop mutations. Positive ESTs spanning most or all of the LTR, gag, PR, RT, RH and/or IN regions were located in most of these six developmental stages, indicating that Sinbad-like elements are actively transcribed in all or most developmental stages of S. mansoni.
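The check for "contiguous ORFs spanning each EST" can be illustrated with a short sketch. This is not the authors' procedure (they do not state which tool was used); it simply tests whether any forward reading frame of a nucleotide sequence is free of stop codons under the standard genetic code. The function name and the toy sequences in the usage note are illustrative only.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(seq: str) -> list[int]:
    """Return the forward frames (0, 1, 2) of seq that contain no stop codon."""
    seq = seq.upper().replace("U", "T")
    frames = []
    for frame in range(3):
        # Walk complete codons only; a trailing partial codon is ignored.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames


def spans_as_orf(seq: str) -> bool:
    """True if at least one forward frame runs through the whole sequence."""
    return bool(open_frames(seq))
```

For example, `open_frames("ATGGCCTAAGGC")` reports frames 1 and 2 as open because the TAA stop lies in frame 0. An EST sequenced from the opposite strand would need to be reverse-complemented first, which this sketch leaves out.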
Discontinuous distribution of Sinbad-like elements
Organisms other than schistosomes with significant EST matches to Sinbad.
BLAST score (bits)
Ciona intestinalis (tunicate)
Molgula tectiformis (tunicate)
Strongylocentrotus purpuratus (purple sea urchin)
Drosophila melanogaster (fruit fly)
Bombyx mori (silk worm)
Salmo salar (Atlantic salmon)
Xenopus laevis (African clawed frog)
Trichinella spiralis (parasitic nematode)
Sinbad – a novel Pao/BEL family LTR retrotransposon from the genome of S. mansoni
Although several LTR retrotransposons have been characterized previously from the genome of S. mansoni, including Boudicca, Saci-1, Saci-2, Saci-3 and the fugitive [17, 20, 37], the Sinbad retrotransposon characterized here is a novel retrotransposon and it is discrete from these other elements. Sequence identity, structure, and phylogenetic relationships indicate that Sinbad is a member of the Pao/BEL family of retrotransposons. The hallmark structures included a triple Cys-His box zinc finger domain in the Gag polyprotein, protease with the active site tripeptide DSG, RT domain that included a YVDD active site motif, RNAseH with DAS at the active site, and an integrase domain with a DD(49)E spacing of the active site aspartic acid and glutamic acid residues. The YVDD motif of RT, a version of the F/YXDD consensus motif of Gypsy-like LTR retrotransposons, is shared by Pao and BEL. Bowen and McDonald  reported that the Cer7-Cer12 series of elements from C. elegans displayed YVDN at this site. Whether the Asn could replace Asp as the carboxy-residue of this conserved tetrapeptide with retention of enzyme activity remains to be determined by biochemical analysis, although mutation of either aspartate in YXDD of retroviral RT (HIV-1 or Moloney murine leukemia virus) inactivates the polymerase [see ].
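As a small illustration of how the RT active-site tetrapeptide discussed above might be located in a deduced protein sequence, the sketch below uses a regular expression that accepts the F/YXDD consensus and also the YVDN variant reported for the Cer7-Cer12 elements. The pattern and function name are assumptions made for illustration, and the sequences in the usage note are toy strings, not real RT sequences.

```python
import re

# F or Y, any residue, D, then D (consensus) or N (the Cer7-Cer12 variant).
RT_MOTIF = re.compile(r"[FY].D[DN]")


def find_rt_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched tetrapeptide) for each motif hit."""
    return [(m.start(), m.group()) for m in RT_MOTIF.finditer(protein.upper())]
```

Calling `find_rt_motifs("MKLYVDDILA")` returns the YVDD hit at position 3. A four-residue pattern will also match by chance in long proteins, so in practice hits would be interpreted only within an aligned RT domain.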
The LTRs of Sinbad in BAC 33-N-3 are identical in sequence, and appeared to contain a putative promoter for initiation of transcription by RNA polymerase II. Along with conservation of most residues contributing to the active sites of the retrotransposon enzyme domains, these structural characteristics suggested that Sinbad is active or had been transpositionally active in the recent past. Several other features also indicated that Sinbad is transpositionally active. Numerous transcripts spanning enzymatic domains and LTRs of Sinbad, from at least six developmental stages of S. mansoni, have been sequenced , and of these, the ESTs most closely resembling Sinbad are composed entirely of contiguous open reading frames, suggesting non-mutated copies. On the other hand, potentially inactivating mutations, including stop codons and frameshifts, suggested that the BAC 33-N-3 copy of Sinbad was incapable of autonomous retrotransposition. If active copies are present, functional proteins coded by these copies could have been used in the recent past to mobilize the 33-N-3 Sinbad copy in trans, as recorded for other retrotransposons [45, 46, 47], explaining the presence of identical LTRs. Indeed, Frame et al.  noted that mutated copies framed by similar LTRs are common in BEL like elements in C. elegans, implying recent transposition.
A Sinbad/Saci-1 subfamily of Pao/BEL-like LTR retrotransposons
Whereas the sequence and deduced structure of the three signature Pao-like elements, Pao from the silk moth B. mori, Tas from the human roundworm Ascaris lumbricoides and BEL from D. melanogaster have been known for about a decade, the Pao/BEL family is not as well understood or apparently as widespread as the other two major families of LTR retrotransposons, the Copia/Ty1 and the Gypsy/Ty3 families. However, at least three branches of the Pao/BEL family have become apparent – branches represented by Pao, BEL, and Suzu (from T. rubripes) [27, 28, 32, 49, 50, 51]. Using the new sequence information from Sinbad, and some related elements, we have been able to investigate the intra-family relations of the Pao/BEL elements more thoroughly. Our findings, based on phylogeny of RT, and supplemented by phylogeny of IN, indicated the presence of at least five sub-families of Pao/BEL elements. The majority of the sub-families may have a restricted host range; the Tas subfamily occurred only in nematodes (these elements may be endogenous retroviruses because they appear to include env genes), the BEL subfamily only in insects, the Pao subfamily only in insects, and the Suzu subfamily only in fishes. By contrast, the Sinbad/Saci-1 subfamily is known from schistosomes and zebrafish.
Phylogenetic range of Sinbad-like retrotransposons
The Pao/BEL retrotransposons are known only from animals, a less extensive distribution than those of the Copia/Ty1 or Gypsy/Ty3 groups that include elements known from fungi and/or plants as well as animals. The ostensible absence of these elements from prokaryotes, lower eukaryotes, fungi and plants suggests that ancestral Pao-like elements appeared after the differentiation of the Animalia. Though the number of sequenced entire genomes of animals is small, the distribution of Pao/BEL LTR retrotransposons within these few genomes displays a pattern that we would not expect to result from vertical transmission alone (Fig. 7). Sinbad-like sequences were found in D. melanogaster, but not in D. pseudoobscura, nor in A. mellifera, even though close relatives are found in other insects such as B. mori and A. gambiae, and even in species as phylogenetically distant as D. rerio (a fish) and S. mansoni (a platyhelminth). Further, the distribution among chordates is enigmatic. Of the vertebrate whole genomes searched, only two, T. rubripes and D. rerio, were positive for Sinbad-like elements. The human, mouse, rat, cow, chicken, pig and dog genomes were devoid of Sinbad-like matches. Since the genomes of lower chordates and a non-chordate deuterostome were positive for Sinbad-like sequences, progressive radiation would be expected to give rise to similar sequences in these vertebrates.
Feschotte  reported a similarly patchy distribution for the Merlin DNA transposons; Merlin like elements are abundant, for example, in anopheline mosquitoes but are absent from D. melanogaster, D. pseudoobscura, and A. mellifera. Also, they are present in some vertebrate genomes but not others. Merlin-like elements are also present in schistosome chromosomes. This type of distribution suggests that either the vertical lineage of the elements has been curtailed by the extinction of these elements from several genomes, or that horizontal transmission has taken place. Genomes need to restrain the uncontrolled proliferation of mobile genetic elements, especially retrotransposons, and indeed some eliminate mobile sequences more efficiently than others [5, 52]. Goodwin and Poulter  have shown that Ngaro elements have been lost from certain genomes, as evidenced by the presence of small, corrupt fragments serving as fossil sequences. Similarly, especially in view of the low number of Sinbad copies, Pao-like elements may have followed a course of progressive radiation followed by elimination from the Sinbad-negative genomes. However, if this were the case with Pao-like elements, relic sequences could be expected in at least some of the Sinbad-negative genomes. Their absence from mammalian and avian genomes favors the alternative explanation, that the current range reflects horizontal transmission.
What might have been the origin of the Pao/BEL radiation within the Animalia? Felder et al.  suggested that a common ancestor of Tas and Pao may have undergone a horizontal transmission event between the Insecta and Nematoda, followed by the eventual differentiation of these elements, including the gain or loss of env. Of the sub-families of Pao/BEL elements apparent in the RT-based phylogram (Figure 5), the Tas subfamily includes retrotransposons with an envelope encoding gene (specifically Tas from A. lumbricoides and Cer7 from C. elegans). The acquisition of an envelope protein by an ancestral Tas or Tas-like element would have enabled its extracellular existence and facilitated its horizontal transmission and infection of other hosts .
Interestingly, the deuterostomes bearing Sinbad-like sequences included a sea urchin, tunicates, pufferfish, zebrafish, the Atlantic salmon, and the African clawed frog X. laevis (Figure 7). These are aquatic species and, moreover, all are known from coastal or brackish waters at the interface of freshwater and marine systems. The secondary hosts of S. mansoni, snails of the pulmonate genus Biomphalaria, are also aquatic, as are the larval (miracidium and cercaria) stages of S. mansoni which enter and exit the snail. It will be of interest to determine whether or not Pao-like elements are present in this snail host, from which numerous RT-encoding sequences already have been reported. Also of potential relevance is that the genomes of both X. laevis and S. mansoni contain Pao-like elements and that X. laevis is the secondary host of the trematode parasite Tylodelphys xenopi, a fluke closely related to the human schistosomes. Both T. xenopi and another human schistosome, Schistosoma haematobium, use snails of the genus Bulinus as intermediate hosts. An aquatic lifestyle is an obvious relationship that links all of the deuterostome hosts of Sinbad-like elements. This aquatic, in comparison to a terrestrial, existence may have facilitated transmission of infectious particles of the Tas-like ancestors of Pao, Tas, BEL, Suzu, Sinbad, and relatives. Alternatively, schistosomes may have acquired a Tas-like element directly from Ascaris lumbricoides, an exceedingly common human parasite and the host of Tas. A. lumbricoides occurs in the intestines of infected people, as do schistosome eggs, so direct transmission of a mobile genetic element from roundworm to schistosome could have been facilitated by their physical proximity within the human intestines.
A Pao/BEL-like LTR retrotransposon named Sinbad is interspersed within the genome of the blood fluke, S. mansoni. About 50 copies of this element appear to reside in the S. mansoni genome. Analyses of the phylogenetic distribution of Pao/BEL-like retrotransposons indicated that Pao/BEL-like elements are present only within phyla of the Animalia, and not in prokaryotes, fungi or plants. Further, the analyses indicated that there are at least five discrete sub-families of the Pao/BEL clade of LTR retrotransposons, and that the distribution of these retrotransposons among the Ecdysozoa, Lophotrochozoa and deuterostomes has been influenced by horizontal as well as vertical transmission.
Screening the bacterial artificial chromosome library
Le Paslier et al. described the construction and characterization of a bacterial artificial chromosome (BAC) library of the Schistosoma mansoni genome. The library, constructed in the plasmid vector pBeloBac11 with genomic DNA (gDNA) from cercariae of a Puerto Rican strain of S. mansoni partially digested with HindIII, consists of 23,808 clones, about 21,000 of which are estimated to contain inserts ranging from 120 to 170 kb, providing ~8-fold coverage of the schistosome genome. Numerous BAC end sequences determined from randomly selected clones from this library are in the public domain. Inspection of the end sequence of BAC clone number 30-H-16 indicated identity with Pao-like LTR retrotransposons (not shown). Because the retrotransposon sequence was located at the end of the BAC, the clone was unlikely to contain the entire Pao-like element. Given that retrotransposons can be expected to be present in multiple copies in the host genome, we screened the library with a probe based on the end of BAC 30-H-16 in order to locate an entire copy of the retrotransposon. The gene probe was obtained by PCR amplification of a fragment of BAC 30-H-16 using the primers 5'-CGCGGATCCAAGAGAAAAACCTTGATAGAC and 5'-CCGGAATTCCTGTCGAAGATAAAAGAGC; the amplicon was cloned into pBluescript and its identity confirmed by sequencing (Accession AY871176). This probe spanned residues 2457 to 2823 of the BAC 33-N-3 copy of the new retrotransposon (see below). The cloned insert was labeled with digoxigenin (DIG) and employed to screen the BAC library, represented as high-density clone arrays on nylon membranes, as described. Positive clones were cultured as described, and the presence of sequences with identity to the novel retrotransposon in the positive clones was confirmed by PCR (primers as above) or by colony hybridization to the DIG-labeled probe. One positive clone, BAC 33-N-3, was investigated further by sequence analysis.
BAC plasmid DNA was isolated from bacterial cultures using the PhasePrep BAC DNA purification system (Sigma). Analysis of the insert of 33-N-3 was accomplished after subcloning BamHI fragments of the BAC into pNEB193 (New England Biolabs, MA), sequencing the inserts of the sub-clones, and also by direct sequencing of BAC 33-N-3. Automated nucleotide sequencing, using ABI BigDye Terminator chemistry (ABI, Foster City, CA) and an ABI Prism 3100 sequencer, was undertaken at Tulane University and at Davis Sequencing (Davis, CA), initially using primers specific for the probe and subsequently using gene-specific primers.
Sequence analysis and alignments
Contigs of the sequences were assembled using SeqMan (DNAstar, Inc., Madison, WI). Repeat sequences were identified with a Pustell style dot matrix  using the DotPlot3 program (Ramin Nakisa, Imperial College, London, UK) [see ] and the Pustell DNA Matrix function in MacVector (Accelrys). Amino acid alignments were accomplished with MacVector and ClustalW  using sequences from GenBank or using conceptual translations of nucleic acid sequences. Open reading frames were located and conceptually translated using MacVector. Sequences of the following retrotransposons were used in the multiple sequence alignments based on gag, protease, and reverse transcriptase: Ninja, T31674; Pao, S33901; MAX, CAD32253; Roo, AAN87269; BEL, AAB03640; and Saci-1, BK004068. Sequences of the following retrotransposons were used in the multiple sequence alignment based on Integrase: Saci-1, DAA04498;Pao, S33901; Ninja, T31674; Roo, AAN87269; Suzu, AF537216, BEL, AAB03640, Tas, Z29712, and MAX, CAD32253.
Parasite DNAs, Southern hybridization, densitometric estimation of copy number
Genomic DNAs of cercariae of a Puerto Rican strain of S. mansoni and of adults of a Chinese (Anhui Province) strain of S. japonicum were extracted using the AquaPure Genomic DNA Purification system (Bio-Rad, Hercules, CA). S. mansoni gDNA (30 μg/lane) and 33-N-3 BAC DNA (800 ng) were digested with HindIII and BamHI restriction enzymes, and S. japonicum gDNA (20 μg/lane) was digested with HindIII. Digested gDNA and BAC DNA were size fractionated by electrophoresis through a 0.8% agarose gel, transferred to a nylon membrane (Zeta-Probe GT, Bio-Rad) by capillary action, and UV-light cross-linked to the membrane. Southern hybridization analysis to the DIG-labeled probe (above) was performed as described. Chemiluminescent signals were detected using X-ray film (Fuji). Densitometric analysis of Southern hybridization signals was accomplished using the Versa-Doc gel documentation system (Bio-Rad) and Quantity-One software (Bio-Rad). Densitometry values for signals evident in the gDNA and BAC DNA lanes were used to estimate the copy number for the new retrotransposon, Sinbad, according to the formula [(A/B) × C]/E = F. This formula was derived from two equations: (A/B) × C = D and D/E = F, where A was the number of copies of Sinbad in the BAC 33-N-3 lane, B was the density volume of the 33-N-3 lane in units of optical density per mm², C was the density volume of the S. mansoni genomic DNA lanes in units of optical density per mm², D was the total number of copies of Sinbad per genomic DNA lane, E was the number of haploid genomes in the gDNA lane, and F represented the copy number of Sinbad per haploid S. mansoni genome. The insert of 33-N-3 was estimated to be 145 kb in length and assumed to contain only a single copy of the retrotransposon.
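The copy-number formula above can be sketched in a few lines of Python. The text does not say how A and E were obtained, so this sketch assumes both were derived from the DNA mass loaded per lane and the length of each molecule, using an approximate mass of 650 g/mol per base pair. The vector length (about 7.5 kb) and the densitometry readings B and C are placeholders, not values from the study.

```python
AVOGADRO = 6.022e23   # molecules per mole
BP_MASS = 650.0       # assumed average mass of one base pair, g/mol


def molecules_in_lane(mass_g: float, length_bp: float) -> float:
    """Number of DNA molecules of the given length in the given mass of DNA."""
    return mass_g * AVOGADRO / (length_bp * BP_MASS)


def copies_per_haploid_genome(a: float, b: float, c: float, e: float) -> float:
    """F = [(A/B) x C] / E, with the symbols as defined in the text."""
    d = (a / b) * c   # D: total Sinbad copies in the genomic DNA lane
    return d / e      # F: Sinbad copies per haploid genome


# A: Sinbad copies in the BAC lane (800 ng of BAC DNA, one copy per molecule).
# The 145 kb insert is from the text; the 7.5 kb vector length is an assumption.
A = molecules_in_lane(800e-9, 145_000 + 7_500)
# E: haploid genomes in one genomic lane (30 ug of gDNA, ~270 Mb per genome).
E = molecules_in_lane(30e-6, 270e6)

# B and C are placeholder densitometry readings, not measured values.
B, C = 1000.0, 1000.0
F = copies_per_haploid_genome(A, B, C, E)
print(f"A = {A:.3g} copies, E = {E:.3g} genomes, F = {F:.1f} copies per genome")
```

Because A and E are both computed from mass and length, Avogadro's number and the per-base-pair mass cancel in the ratio A/E, so the estimate depends only on the loaded masses, the molecule lengths, and the measured ratio C/B.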
Other copy number estimations
In addition to the densitometry-based estimate, estimates of the copy number of the Sinbad retrotransposon also were obtained by a comparative bioinformatics approach wherein BLAST analysis of the bacterial artificial chromosome (BAC)-end database of S. mansoni genomic sequences targeted better-characterized retrotransposable elements from S. mansoni for which copy numbers had been reported. These included the Boudicca LTR retrotransposon and the non-LTR retrotransposons SR1 and SR2 [61, 62]. The NCBI database was searched by BLAST using the sequences of these mobile genetic elements and some other genes of S. mansoni, all of which included at least one HindIII site. Specifically, the Advanced BLAST function was used, set to search only the S. mansoni sequences in the GSS database (Limit by Entrez Query: <Schistosoma mansoni[organism]>), and with the E value at 0.000001. The E value (Expect value) is the number of matches with an equal or better score that would be expected purely by chance. Hits with E values at or below this stringent cutoff of 10⁻⁶ were counted as positive. This exceptionally stringent cutoff value was used to minimize the chance of counting other Pao-like elements in the total copy number of Sinbad. Since the formula for E is based not only on the bit scores of the local alignment of each pair of sequences, but also on the lengths of the subject and query [see ], no additional correction was made for the length of the query sequence.
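The dependence of E on sequence length can be made concrete with the standard Karlin-Altschul relationship, E = m × n × 2^(−S'), where S' is the bit score, m the query length and n the database length. The sketch below is a simplification: BLAST itself uses effective lengths corrected for edge effects, and the database size used here is a hypothetical figure chosen only for illustration.

```python
import math


def expect_value(bit_score: float, query_len: float, db_len: float) -> float:
    """Expected number of chance hits scoring at least bit_score."""
    return query_len * db_len * 2.0 ** (-bit_score)


def min_bit_score(e_cutoff: float, query_len: float, db_len: float) -> float:
    """Smallest bit score whose E value does not exceed e_cutoff."""
    return math.log2(query_len * db_len / e_cutoff)


QUERY_LEN = 6287        # full-length Sinbad in bp, from the text
DB_LEN = 50_000_000     # hypothetical database size in nucleotides

# Bit score needed to reach the 0.000001 cutoff under these assumptions.
print(round(min_bit_score(1e-6, QUERY_LEN, DB_LEN), 1))
```

Under this relationship a longer query needs a higher bit score to reach the same E value, which is why a fixed E cutoff already accounts for query length.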
Phylogenetic analysis of Pao-like elements
Sequences for phylogenetic analysis comparing the RT region of several different retrotransposons were prepared by trimming sequences from the large single polyprotein of each retrotransposon to just the conserved domains of RT (see [21, 27]). Pol sequences presented in Xiong et al. and Abe et al. were trimmed exactly to the stretch of sequence shown by these authors to represent the RT domain. Other elements were aligned with these sequences and likewise trimmed to obtain just the RT domain. For some elements, nucleotide sequences were analyzed for open reading frames and translated before being trimmed to include just the 7 conserved blocks of the RT domain. Alignments were accomplished using Clustal X, after which bootstrapped trees (1,000 repetitions) were prepared using the neighbor joining method and drawn with Njplot. The accession numbers for sequences included in the phylogenetic analysis are as follows: Ty3, S53577; Tas, Z29712; Suzu, AF537216; Sinbad, AY506538 (an N was inserted at position 2761 to resolve a frameshift and generate a single ORF); Saci-1, DAA04498; Roo, AAN87269; Ninja, T31674; Moose, AF060859; Max, CAD32253; Kamikaze, AB042120; HIV-1, P04585; Gypsy, GNFFG1; Gulliver, AF243513; Copia, OFFCP; BEL, AAB03640; Cer7, AAB63932; Cer8, CAB04994; Cer9, CAB1647; Cer11, AAA82437; two uncharacterized Anopheles gambiae retrotransposons, XP_309281 and XM_308737; an uncharacterized Caenorhabditis briggsae retrotransposon, AC084491; and two uncharacterized Danio rerio retrotransposons, BX537152 and BX005079 [see Additional file 2]. Two additional sequences were either not in the database or were composites made to reconstruct sequences more closely resembling non-mutated forms of the retrotransposons. The sequence representing Pao was a reconstruction prepared by Abe et al., from accession numbers S33901, AB042118, and AB042119; the sequence representing Boudicca was a composite of translated cDNA sequences introduced in Copeland et al., AY308018, AY308019, AY308021 and AY308022 [see Additional file 2].
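The bootstrapping step can be illustrated with a minimal sketch of column resampling. This is not the Clustal X implementation used in the study: it only shows how one bootstrap replicate is drawn from an alignment (columns sampled with replacement) and how an uncorrected p-distance matrix, the kind of input a neighbor-joining program works from, could be computed for it. All names and the toy alignment are illustrative.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites, skipping columns with a gap in either row."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")   # no comparable columns
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignment(aln: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Draw one replicate by sampling alignment columns with replacement."""
    n_cols = len(next(iter(aln.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}


def distance_matrix(aln: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances keyed by alphabetically ordered name pairs."""
    names = sorted(aln)
    return {(p, q): p_distance(aln[p], aln[q])
            for i, p in enumerate(names) for q in names[i + 1:]}


# Toy aligned fragments (not real RT sequences).
toy = {"a": "ACDE", "b": "ACDF", "c": "AC-E"}
replicate = bootstrap_alignment(toy, random.Random(0))
print(distance_matrix(replicate))
```

In a full analysis this resampling would be repeated 1,000 times, a tree built from each replicate, and the support for a branch reported as the fraction of replicate trees that contain it.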
Screening entire or partial genomes for Sinbad
A panel of fully or partially sequenced entire genomes was searched by BLAST for elements exhibiting sequence similarity to Sinbad. The deduced amino acid sequence encoding the region from the Cys-His Box through to the protease domain (encoded by nucleotides 106 to 753 of Sinbad [AY506538]) was employed as the query to search each genome individually using tBLASTn. The genomes searched in this way were as follows: Homo sapiens, Mus musculus, Rattus norvegicus, Takifugu rubripes, Danio rerio, Bos taurus, Gallus gallus, Sus scrofa, Canis familiaris, Anopheles gambiae, Apis mellifera, Drosophila melanogaster, Drosophila pseudoobscura, Brugia malayi, Caenorhabditis elegans, Caenorhabditis briggsae, Strongylocentrotus purpuratus, Ciona intestinalis, Ciona savignyi, Giardia lamblia, Plasmodium falciparum, Plasmodium yoelii, Plasmodium berghei, Cryptosporidium parvum, Eimeria tenella, Theileria annulata, Toxoplasma gondii, Dictyostelium discoideum, Entamoeba histolytica, Leishmania major, Trypanosoma brucei, Trypanosoma cruzi, Arabidopsis thaliana, Avena sativa, Glycine max, Hordeum vulgare, Oryza sativa, Triticum aestivum, Zea mays, Lycopersicon esculentum, Schizosaccharomyces pombe, Saccharomyces cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, Saccharomyces bayanus, Saccharomyces castellii, Saccharomyces kluyveri, Saccharomyces kudriavzevii, Neurospora crassa, Magnaporthe grisea, Aspergillus nidulans, Aspergillus fumigatus, Aspergillus terreus, Candida albicans, Coccidioides posadasii, Gibberella zeae, Coprinopsis cinerea, Cryptococcus neoformans, Ustilago maydis and Encephalitozoon cuniculi. In addition, 275 eubacterial and 21 Archaean genomes were searched [see Additional file 3]. Genomes with matches with E values less than 0.001 (corresponding approximately to bit scores greater than 40) were considered positive for Sinbad-like elements.
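The screening rule can be written down as a short sketch. The query coordinates (nucleotides 106 to 753) and the E value threshold of 0.001 come from the text; the function names, the genome labels and the E values in the example are hypothetical placeholders rather than results from the study.

```python
E_CUTOFF = 1e-3   # genomes with a best hit below this E value count as positive


def query_codons(start_nt: int, end_nt: int) -> int:
    """Number of codons in an inclusive, 1-based nucleotide interval."""
    length = end_nt - start_nt + 1
    if length % 3:
        raise ValueError("interval is not a whole number of codons")
    return length // 3


def positive_genomes(best_e_values: dict) -> list:
    """Names of genomes whose best hit has an E value below the cutoff.

    A value of None means that no hit was returned for that genome.
    """
    return sorted(name for name, e in best_e_values.items()
                  if e is not None and e < E_CUTOFF)


# The query region from the text corresponds to 216 codons.
print(query_codons(106, 753))

# Placeholder best-hit E values for three made-up genomes.
example = {"genome_A": 2e-25, "genome_B": 0.4, "genome_C": None}
print(positive_genomes(example))
```

The comparison is strict, matching the text's "less than 0.001", so a hit with an E value of exactly 0.001 would not be counted.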
GenBank accession numbers
Sequences of the Sinbad LTR retrotransposon have been assigned accession numbers AY506537, AY506538, AY645721, AAT66412, and AY871176. Other sequences introduced here have been assigned GenBank Third Party Annotation accession numbers: BK005570 (Danio rerio), BK005571 (D. rerio), BK005572 (Caenorhabditis briggsae), BK005573 (Anopheles gambiae), BK005574 (D. rerio).
Schistosome parasites were supplied by Dr. Fred Lewis through NIAID-NIH supply contract NO1-A1-55270. We thank Dr. Philip LoVerde for provision of the BAC library and the anonymous reviewers for helpful suggestions. This investigation received financial support from the Ellison Medical Foundation (Infrastructure Grant award ID-IA-0037-02). PJB is a recipient of a Burroughs Wellcome Fund Scholar Award in Molecular Parasitology.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.