Background

Ejaculated spermatozoa are terminally differentiated cells in which transcription and/or translation of nuclear-encoded mRNAs is considered unlikely. The paternal genome was therefore long regarded as the only cargo carried by spermatozoa to the ooplasm. The discovery of several transcription factors, soluble signaling molecules and structures delivered by spermatozoa into the zygotic cytoplasm upon fertilization has changed this perception [1–3]. Despite their transcriptionally inactive state, spermatozoa retain an entourage of mRNA transcripts encoding transcription factors and proteins involved in signal transduction, cell proliferation, chromatin condensation, regulation of sperm motility, capacitation and acrosome reaction [2–7]. Delivery of such spermatozoal transcripts to the ooplasm is envisaged to be of significance during fertilization, embryogenesis and morphogenesis. Approximately 3000–5000 mRNA transcripts have been suggested to be present in spermatozoa [2–7]. However, these remain to be characterized for their organization, expression and association with different regulatory elements and repetitive sequences.

Repetitive DNA sequences are dynamic components of the genome encompassing transposable elements, major satellites, minisatellites and microsatellites [8, 9]. Most of these repeats are found in the non-coding regions of genomes, while a small fraction is retained within the transcriptome [10–12] and participates in gene regulation through transcription, translation or gene silencing [13–15]. Surprisingly, the organization of repeats within the transcriptome of any mammalian species, particularly in spermatozoa, has remained undeciphered. To explore the organization of such repeats and the repeat tagged genes, we undertook a transcriptome analysis of water buffalo Bubalus bubalis, an important player in the agriculture, dairy and meat industries of the Indian sub-continent.

Minisatellite 33.15 (5' CACCTCTCCACCTGCC 3') originating from the human myoglobin gene (7q35-q36) has been studied in a number of species [16–18]. This repeat has also been found to be associated with the heterochromatic sequences of the human Y chromosome [19]. In our previous in-silico study, we demonstrated the presence of minisatellite 33.15 within the transcriptomes of several eukaryotes. Following this, we isolated and characterized several known and novel mRNA transcripts tagged with the consensus of 33.15 repeat loci from different somatic tissues and gonads of water buffalo Bubalus bubalis [20].

Owing to the envisaged participation of the spermatozoal mRNA transcripts during early zygotic and embryonic development, we studied the spermatozoal transcriptome of water buffalo Bubalus bubalis tagged with minisatellite 33.15 employing Minisatellite associated sequence amplification (MASA). These mRNA transcripts were further characterized for their sequence organization, homology status, variation of gene expression, copy number and evolutionary status of the corresponding genes.

Results

Differential distribution of the consensus sequence of 33.15 repeat loci within the spermatozoal and somatic transcriptomes

In our previous study, we identified 6 different mRNA transcripts from various tissues of water buffalo using an oligo based on the consensus of 33.15 repeat loci [20]. In the present study, we used this oligo (5' CACCTCTCCACCTGCC 3') for MASA and uncovered a total of 72 amplicons comprising 12 different mRNA transcripts from the spermatozoa (Figure 1). Cloning and sequencing followed by BLAST searches showed that 1 of these transcripts had no homology with sequences in GenBank and 10 were similar to uncharacterized BAC clones originating from cattle Bos taurus, human Homo sapiens and mouse Mus musculus. Only one mRNA transcript was homologous to a known gene, the bovine steroid 21-hydroxylase (P-450-c21) gene (Table 1). Interestingly, comparative analysis showed that none of these spermatozoal mRNA transcripts was common to those identified from different somatic tissues, testis and ovary in the earlier study [20]. However, cloning and sequencing of ~20 recombinant clones corresponding to each of these 72 fragments from seven different animals demonstrated their consistent presence in the spermatozoa. To ascertain the structural, functional and regulatory status of the MASA uncovered genes/gene fragments, we conducted a comprehensive database search for each mRNA transcript independently. The details of the homologous genes, their accession numbers with chromosomal locations (if available), species, position of our cDNA sequences and their possible functions are given in Table 1.

Table 1 Detailed analysis of the spermatozoal mRNA transcripts tagged with consensus of 33.15 repeat loci in buffalo Bubalus bubalis and their homology or similarity with genes across the species#
Figure 1 Minisatellite associated sequence amplification (MASA) performed using oligo based on the consensus of 33.15 repeat loci and cDNA from the spermatozoa of 7 different animals. Four to seven transcripts in the range of 0.15 kb to 0.5 kb were detected (A). β-actin was used as an internal control with cDNA from all the samples (B).

Conserved 33.15 tagged spermatozoal genes are present as single to double copy in buffalo genome

To determine the possible evolutionary significance of the MASA uncovered genes/gene fragments, we conducted cross-hybridization studies using the individual recombinant clones containing the corresponding gene fragments with the gDNA of 13 different species (Figure 2). All the 33.15 tagged spermatozoal genes cross-hybridized with all thirteen species, suggesting ubiquitous roles in higher eukaryotes.

Figure 2 Cross-hybridization of genomic DNA from different species with the recombinant clones corresponding to different gene fragments uncovered by MASA. The names of the species are given on top of the panel and recombinant clones corresponding to genes/gene fragments, on the left.

Following sequence analyses and conservation studies, the copy number of these genes/gene fragments was calculated using the standard curve method and SYBR green chemistry. Extrapolation of the standard curves obtained with a 10-fold dilution series of the respective recombinant plasmids containing the corresponding gene fragments (Additional file 1A–B) demonstrated that each is present as 1 to 2 copies per haploid genome in buffalo (Table 2).

Table 2 Relative quantitative expression and copy number status of the spermatozoal genes/gene fragments tagged with the consensus of 33.15 repeat loci

Differential expression profiles of 33.15 tagged mRNA transcripts

Comparative expression profiles of the uncovered genes were studied to explore their functional status in different somatic tissues, gonads and spermatozoa. Expression of each MASA uncovered gene/gene fragment was studied first by RNA slot blot hybridization (not shown) and later by RT-PCR analysis (Figure 3). Quantitative expression analysis was then performed for each transcript by Real Time PCR with SYBR green assays, using equal amounts of cDNA template and GAPDH as an internal control.

Figure 3 RT-PCR analyses for the 33.15 tagged mRNA transcripts using internal primers and cDNA from different somatic tissues, gonads and spermatozoa. Transcript IDs are given on the right and names of the tissues on top. Quality and quantity of the cDNA samples were normalized by PCR with β-actin derived primers. Tissue specificities of the transcripts were ascertained on the basis of presence or absence of the amplicons using corresponding cDNA templates which were further confirmed by Real Time PCR and Southern blotting.

The primer specificity for each individual gene was established using a five-fold dilution series of the template cDNA (Additional file 1C–D). A linear standard curve with a slope between -3.4 and -3.6 and a single peak in the dissociation protocol established the specificity and efficiency of the primers. For greater accuracy, the expression study was conducted using three different dilutions of cDNA. The relative quantitation of expression so obtained was substantiated further using somatic/gonadal/spermatozoal cDNA from five additional animals.

The relative expression study across all the somatic tissues, gonads and spermatozoa demonstrated the highest and/or unique expression of all 12 transcripts in the spermatozoa, including that of pJSC40 [coding for the buffalo steroid 21-hydroxylase (P-450-c21) gene, GenBank accession number EU348480]. When expression was compared among the somatic and gonadal tissues alone, one transcript showed uniform expression and another showed its highest expression in testis, liver, lung and brain. The remaining ten transcripts showed their highest expression in the testis compared with the other somatic tissues and ovary (Figure 4 and Table 2). Notably, 4 of the 12 uncovered genes (pJSC42, pJSC44, pJSC45 and pJSC46) showed negligible expression in the forebrain but comparatively higher expression (18- to 326-fold) in the hindbrain of buffalo.

Figure 4 Comparative expression of 33.15 tagged genes/gene fragments. Bars show expression levels across the tissues and spermatozoa of buffaloes; different colors correspond to different mRNA transcripts. Note the highest expression of most of the fragments in the testis and/or spermatozoa.

Discussion

Minisatellites have been implicated in gene regulation, chromosomal fragile sites and genome imprinting [15, 21, 22], but the biological significance of their association with coding regions remains largely unresolved. In the present study, we established the association of 33.15 repeats with the spermatozoal transcriptome of water buffalo, Bubalus bubalis, and observed the highest expression of these transcripts in testis and/or spermatozoa, suggesting important roles in male gametogenesis.

Potential implications of association of repeats with the spermatozoal transcriptome

Our previous study showed the distribution of 33.15 sequences in the flanking regions and within the mRNA transcripts of many structural, functional and regulatory genes in buffalo and six other species [20]. In that study, various 33.15 tagged mRNA transcripts were uncovered from somatic tissues and gonads of buffalo. Here, a total of 12 different mRNA transcripts, one known and the others mostly uncharacterized, were identified from spermatozoa (Figure 1 and Table 1), complementing our earlier study. Notably, none of the transcripts was shared between somatic/gonadal tissues and spermatozoa. This differential transcript profile may reflect diverse functions in somatic tissues, gonads (testis/ovary) and spermatozoa. Alternatively, the transcripts may have specific roles during different stages of development. The spermatozoal transcripts may not have been picked up in somatic tissues/gonads, and vice versa, owing either to the polymorphic nature of SSRs or to their much lower numbers. A comparison of the mRNA transcripts uncovered in the present study with those reported earlier [3] confirmed that they are different.

Thus, the present study can be used as a basis for evaluating other repeats to unveil their combined significance within and adjacent to coding regions. Spermatozoal RNA has come a long way from being regarded as an artifact to being recognized as a repository of a variety of RNAs. The initial controversy that shrouded the existence of a spermatozoal RNA pool has now been resolved by the identification of several transcripts in different species, including human [2–7].

Many signaling molecules and transcription factors that are transported by the spermatozoa into the zygotic cytoplasm upon fertilization have been identified [2–7]. However, several other spermatozoal transcripts remain to be characterized. The present study appears to be the first to reveal the existence of minisatellite tagged transcripts in buffalo spermatozoa, suggesting their involvement in pre- and post-fertilization events. This becomes more significant given that these transcripts were conserved across thirteen different species, suggesting testicular functions in a much broader range of higher eukaryotes.

Prospects of repeat tagged transcripts carrying highest expression in the testis and spermatozoa

The highest expression of all the 33.15 tagged genes/gene fragments in the testis and spermatozoa suggests their participation in testicular development and spermatogenesis. Further, the uniform expression of the steroid 21-hydroxylase gene across somatic tissues and gonads suggests a consistent requirement for it in all tissues other than spermatozoa, where its expression was highest. Steroid 21-hydroxylase, P450 (C21), belongs to the cytochrome P-450 super-gene family and plays a crucial role in the synthesis of steroid hormones such as cortisol and aldosterone. Mutations in this gene have been implicated in adrenal hyperplasia [1].

This gene has been reported to show maximum expression in the adrenal glands and minimum expression in liver and testis. In our study, however, it showed maximum expression in spermatozoa, adding a new dimension to its as yet unidentified functions in sperm development and maintenance. Expression of the P450 (C21) gene could not be confirmed in the adrenal gland of buffalo because the tissue sample was unavailable. Since spermatozoal mRNA is delivered to the ooplasm and has potential roles in chromatin repackaging, early zygotic and embryonic development and post-syngamy events [2, 3, 5, 23], the observed highest expression of all the 33.15 tagged mRNA transcripts in spermatozoa suggests important roles in specific functions related to sustenance of the paternal genome.

Conclusion

The present work demonstrated the association of the consensus sequence of minisatellite 33.15 with the buffalo spermatozoal transcriptome. Maximum expression of all the 33.15 tagged mRNA transcripts in testis and spermatozoa suggests their involvement in various testicular functions and fertilization. Following this approach, other repeat tagged mRNA transcripts may be studied within the buffalo genome and across species. A comparative mRNA fingerprint based on several minisatellites, resulting in a detailed expression profile, would allow the identification of common and uncommon transcripts in somatic tissues and gonads. This would help in understanding the more specific roles of repetitive DNA sequences in the paternal genome during and after fertilization and at different stages of embryonic development.

Methods

Sperm purification and RNA isolation

Fresh ejaculated semen samples of buffaloes were obtained from the local dairy farm, strictly following the guidelines of the Institute's Bio-Ethical and Biosafety Committees. All the sperm samples were procured from the buffalo in-vitro fertilization (IVF) center, where utmost care was taken to avoid contamination of the sperm preparations with diploid cells. Samples were subjected to the percoll gradient method to select only motile sperm as described earlier [23]. Sperm cells were washed in sperm wash buffer (0.15 mM NaCl and 10 mM EDTA) and pellets were processed for RNA extraction using TRIzol reagent (Invitrogen, CA). After re-suspension in TRIzol reagent, cells were incubated at 60°C for 30 min as described earlier [23], with vortexing every 10 min for lysis, following the manufacturer's instructions. Each RNA sample was treated with RNase free DNase I (Ambion, USA) and precipitated with ammonium acetate and ethanol following standard protocols [23–25].

Quantification of RNA was done with spectrophotometer (Amersham Life Sciences). A fraction of each RNA sample (5 ng/μl) was tested for residual DNA contamination by PCR using primers against β-actin (ACTB; GenBank accession no. DQ661647) and Protamine-1 (PRM1; GenBank accession no. NM_174156) following standard procedures [23, 25] (Additional file 2). These exon-specific primers encompassed one intron, thus, distinguishing cDNA from the genomic sequence. The PCR was conducted using initial denaturation at 95°C for 5 min; 35 cycles of denaturation at 94°C for 30 sec, primer annealing at 55°C for PRM1 and 60°C for ACTB for 30 sec, extension at 72°C for 1 min; and final extension at 72°C for 10 min.

Isolation of genomic DNA, total RNA and cDNA synthesis

Blood and tissue samples of both sexes of water buffalo were collected from the local slaughterhouse, following the guidelines of the Institute's Ethical and Bio-safety Committees. Details of the genomic DNA isolation from buffalo and other species used in this study for cross-hybridization have been reported earlier [20, 26]. Total RNA was also isolated from testis, ovary, spleen, liver, lung, kidney, heart and brain using TRIzol reagent (Invitrogen, CA) as per the standard procedure [23–25]. Following quantification of RNA, cDNA synthesis was conducted using a commercially available cDNA archive kit (ABI, USA). The quality of the cDNA so prepared was confirmed by PCR amplification using β-actin and PRM1 primers. The absence of somatic and other non-sperm cells in each sperm RNA preparation was confirmed by PCR using each cDNA sample with the common leukocyte antigen (CD45) and epithelial E-cadherin (CDH1) gene markers (Additional files 2 and 3) at 55°C annealing temperature.

Minisatellite associated sequence amplification (MASA)

For conducting minisatellite associated sequence amplification (MASA), a 16 base long oligo (5' CACCTCTCCACCTGCC 3') based on consensus of 33.15 repeat loci, was purchased from Microsynth GmbH (Balgach, Switzerland). MASA reactions were performed using cDNA templates from the spermatozoa of different animals following standard procedures [20, 26]. The resultant amplicons were resolved on 20 cm long 2.5% (w/v) agarose gel using 0.5× TBE buffer.
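The single-oligo priming described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes exact matches only (real MASA reactions tolerate mismatches depending on annealing stringency), and the function names and the default 150–500 bp window (taken from the size range reported for Figure 1) are illustrative assumptions. A product is predicted wherever the oligo occurs on the top strand and its reverse complement occurs downstream, so that the same oligo can prime on both strands.

```python
# Illustrative only: exact-match model of single-primer (MASA-style) priming.
# Function names and the size window are assumptions, not part of the protocol.

OLIGO = "CACCTCTCCACCTGCC"  # 33.15 consensus oligo given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(template, motif):
    """Return 0-based start positions of every exact (overlapping) match."""
    sites = []
    start = template.find(motif)
    while start != -1:
        sites.append(start)
        start = template.find(motif, start + 1)
    return sites


def predicted_amplicons(template, oligo=OLIGO, min_len=150, max_len=500):
    """Lengths of products bounded by the oligo on the top strand and by its
    reverse complement further downstream (oligo priming on the bottom strand)."""
    template = template.upper()
    forward_sites = find_sites(template, oligo)
    reverse_sites = find_sites(template, reverse_complement(oligo))
    lengths = []
    for f in forward_sites:
        for r in reverse_sites:
            length = r + len(oligo) - f  # product includes both priming sites
            if r > f and min_len <= length <= max_len:
                lengths.append(length)
    return sorted(lengths)
```

Calling `predicted_amplicons(cdna_sequence)` on a hypothetical cDNA string returns the sizes of candidate products within the chosen window; an empty list means no pair of oppositely oriented exact sites was found.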

Cloning, sequencing and characterization of MASA uncovered amplicons

From the MASA reactions, a total of 72 amplicons were uncovered (Table 1). The amplicons resolved on the agarose gel were sliced out, and the DNA was eluted (Qiagen Gel Extraction kit, Germany) and processed independently for cloning into the pGEMT-easy vector (Promega, USA). A total of 20–25 recombinant clones corresponding to each amplicon were characterized by restriction digestion and slot blot hybridization using labeled buffalo genomic DNA [27]. Sequences of all the clones generated from every single amplicon were independently subjected to ClustalW alignment to ascertain interclonal variation. Finally, a total of 12 sequences were deposited in the NCBI GenBank; their accession numbers are given in Table 1.

Database searches were conducted to determine the homology of each of these sequences independently with other entries in GenBank using the default NCBI server (http://blast.ncbi.nlm.nih.gov/Blast.cgi) with the megablast ("highly similar") and blastn ("somewhat similar") algorithms. Each sequence was first subjected to a BLAST search across the Nucleotide collection (nr/nt), Reference mRNA sequences (refseq_mRNA), NCBI genomes (chromosome) and Expressed sequence tags (ESTs) databases of all organisms, and thereafter individually against those of cattle, sheep, goat and human.

Cross-hybridization of the uncovered genes/gene fragments

Cross-hybridization studies were conducted using DNA from across the species. DNA was extracted from the peripheral blood of buffalo Bubalus bubalis, cattle Bos indicus, sheep Ovis aries, goat Capra hircus, human Homo sapiens, pigeon Columba livia, pig Sus scrofa, baboon Papio hamadryas, bonnet monkey Macaca radiata, langur Presbytis entellus, rhesus monkey Macaca mulatta, lion Panthera leo and tiger Panthera tigris following standard protocols [25, 26]. Lion and tiger blood samples were procured with due approval of the competent authorities of the States and Union Government of India. Hybridization of genomic DNA from different sources using recombinant cloned probes was conducted following standard procedures [25, 27].

RNA slot blot analysis, RT-PCR and Southern Blotting

For RNA slot blot analysis, approximately 2 μg of total RNA from different tissues of buffalo in 100 μl of 2 × SSC was slot blotted onto the nylon membrane (Minifold Apparatus, Schleicher & Schuell, Germany) and UV fixed. As a positive control, 5 ng of each recombinant plasmid was included in the blot(s). Hybridizations were performed under high-stringency conditions using standard procedures [25, 26]. Individual probes for each fragment were labeled with [32P] α-dCTP using the rediprime™ II kit (Amersham Pharmacia Biotech, USA). To confirm the RNA slot blot results, internal primers were designed from each fragment (Additional file 4) and RT-PCR was conducted using cDNA templates from different tissues following standard thermal profiles (Additional file 4). The products were resolved on an agarose gel and transferred onto a nylon membrane, followed by hybridization with [32P] α-dCTP labeled recombinant clones corresponding to each uncovered fragment using standard procedures [25–27]. A bubaline-derived β-actin gene probe and bacterial genomic DNA were used as positive and negative controls, respectively.

Relative Expression Studies Using Real Time PCR

For expression quantification of the different genes/fragments (JS1–JS12), SYBR green assays were conducted using Real Time PCR (Sequence Detection System, 7500, ABI) for individual fragments using equal amounts of cDNA from all the tissues and spermatozoa. The GAPDH and ACTB genes were used as endogenous controls for relative expression. The primers for assessing relative expression of each of the transcripts (JS1–JS12, GAPDH and ACTB) and the copy numbers of their corresponding genes in the haploid genome were designed using Primer Express 3.0 software (Applied Biosystems, USA). The primer details are given in Additional file 4. Standard curves for all the genes were composed of five points covering 10 log orders of magnitude using a serial dilution of a purified cDNA template. Slopes of these standard curves ranged between -3.3 and -3.6, corresponding to amplification efficiencies of roughly 90% to 100%.
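The link between a standard-curve slope and amplification efficiency can be made explicit with the commonly used relationship E = 10^(-1/slope) - 1, where the slope is that of Ct plotted against log10 of template amount. The short Python sketch below applies this relationship to the slopes quoted in the text; the function name is an illustrative assumption, and the formula is the general qPCR convention rather than something stated in this article.

```python
def efficiency_from_slope(slope):
    """Amplification efficiency E from the slope of Ct versus log10(template).

    Uses the conventional relationship E = 10 ** (-1 / slope) - 1, so that a
    slope of about -3.32 corresponds to E = 1.0 (perfect doubling per cycle).
    """
    if slope >= 0:
        raise ValueError("a qPCR standard-curve slope is expected to be negative")
    return 10 ** (-1.0 / slope) - 1.0


# Slopes quoted in the text for the standard curves.
for slope in (-3.3, -3.4, -3.6):
    print(f"slope {slope}: efficiency = {efficiency_from_slope(slope):.1%}")
```

Shallower (less negative) slopes within this range correspond to efficiencies closer to 100%, and steeper slopes to lower efficiencies.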

Melting-curve analysis was performed for the SYBR Green amplifications by plotting fluorescence intensity in a graphic model. To validate the specificity of the qRT-PCR assays, a single melting temperature peak representing a single amplicon was confirmed. Thereafter, the quantitative RT-PCR assay was performed individually for the JS1–JS12, GAPDH and ACTB genes (Additional file 4) using SYBR Green chemistry containing 5 μl of 2× Power SYBR Green Master Mix (Applied Biosystems), 0.3 μM each of forward and reverse primers, and 0.25 U of AmpErase UNG (Applied Biosystems). All the runs included a triplicate of the melting curve for each sample and three to six negative controls (absence of target DNA) that were partitioned to detect potential contamination during preparation of the plate. Thermal cycling conditions were: 50°C for 2 min (incubation for the AmpErase UNG), a first denaturation step at 95°C for 10 min, followed by 40 cycles of 95°C for 15 sec and 60°C for 1 min. Further, each experiment was repeated three times at different concentrations of the template to ensure consistency of the results. The expression level of the genes was calculated using the formula: expression status = (1 + E)^(-ΔΔCt), where E is the efficiency of the PCR and ΔΔCt is the cycle threshold normalized first with the endogenous control GAPDH (Ct sample - Ct GAPDH = ΔCt) and then with the calibrator sample (ΔCt sample - ΔCt calibrator = ΔΔCt) [23, 25, 26].
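The relative expression calculation described above can be written out as a short Python sketch. The function name and the Ct values in the usage note are hypothetical and serve only to show the arithmetic; the formula itself, (1 + E)^(-ΔΔCt) with GAPDH as the endogenous control, follows the text.

```python
def relative_expression(ct_target_sample, ct_gapdh_sample,
                        ct_target_calibrator, ct_gapdh_calibrator,
                        efficiency=1.0):
    """Relative expression as (1 + E) ** (-ddCt).

    dCt  = Ct(target) - Ct(GAPDH), computed for the sample and the calibrator.
    ddCt = dCt(sample) - dCt(calibrator).
    efficiency is E, with 1.0 meaning perfect doubling per cycle.
    """
    delta_ct_sample = ct_target_sample - ct_gapdh_sample
    delta_ct_calibrator = ct_target_calibrator - ct_gapdh_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return (1.0 + efficiency) ** (-delta_delta_ct)
```

With hypothetical values of Ct 22 (target) and 18 (GAPDH) in the sample, and Ct 25 (target) and 18 (GAPDH) in the calibrator, ΔΔCt is 4 - 7 = -3, so at E = 1 the sample expresses the transcript at 2^3 = 8 times the calibrator level.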

Copy number calculation of the 33.15 tagged genes using Real Time PCR

The copy number of the genes/gene fragments corresponding to each 33.15 tagged transcript was calculated by absolute quantification using SYBR green chemistry and the standard curve method. The primers standardized and checked for quantitative expression of JS1–JS12 were also employed for copy number calculation (Additional file 4). A ten-fold dilution series of known copies of each recombinant clone (individually for JS1–JS12) was used to plot a standard curve with a slope of -3.3 to -3.4, indicating near-maximal efficiency of the real time PCR reaction. Extrapolation of the standard curves for the individual recombinant clones gave the copy number of each 33.15 tagged gene in the haploid buffalo genome.
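The standard curve method described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, the fit is an ordinary least-squares line of Ct against log10(copies), and the number of haploid genomes loaded into the reaction is left as an input that the user must supply from the amount of genomic DNA used.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_haploid_genome(ct, slope, intercept, haploid_genomes_loaded):
    """Estimated gene copies divided by the haploid genomes in the reaction.

    haploid_genomes_loaded is an assumed input derived from the mass of
    genomic DNA used; it is not computed here.
    """
    return copies_from_ct(ct, slope, intercept) / haploid_genomes_loaded
```

A ten-fold plasmid dilution series supplies the `copies` and `ct_values` lists for `fit_standard_curve`; the Ct measured for genomic DNA is then passed to `copies_per_haploid_genome`, and a result near 1 or 2 would correspond to a single- or double-copy gene.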