Introduction

Fertilization reprograms the human egg and sperm from non-dividing, fully differentiated cells into totipotent, cleaving blastomeres. Totipotency persists for three or four cell doublings, from the single-cell zygote to the 8- to 16-cell morula. The morula traps one or two cells inside, giving rise to an inner group of cells (the inner cell mass, ICM) and an outer layer of trophoblast cells [1]. Once the embryo has reached the 32- to 64-cell stage, the trophoblast cells pump water and nutrients into the interior of the ball, forming a blastocyst, within which the ICM cells continue to proliferate. It is at the blastocyst stage that ICM cells are harvested for the derivation of embryonic stem (ES) cells [2, 3].

Early cleavage divisions are supported by proteins and messenger RNAs stockpiled in the egg, with new gene expression detected at the four- to eight-cell stage in the human [4]. Hence, the 8-Cell embryo is a unique totipotent stage in the human, beginning to guide its own development. For survival, it must quickly give rise to the critical mass of healthy cells needed to signal the mother that it is developing; failure to do so results in menses within a few days. The totipotent 8-Cell may therefore function independently of outside stimuli, and be enriched for cell cycle and chromosome replication machinery that is designed for perfection.

Characterizing gene expression in normal human blastomeres will begin to reveal pathways essential to totipotency, as well as provide guidelines to distinguish viable from non-viable embryos to improve outcomes of assisted reproduction. Such information will also provide reference standards for eggs activated artificially to generate parthenote stem cells for therapeutic purposes [5–7]. Ethical considerations surrounding human embryo research necessitate experimental approaches that are accurate with extremely small quantities of starting material. We have taken advantage of newly developed methods for linear amplification of small quantities of mRNA [8] and improved whole human genome microarrays [9–11] to characterize gene expression in two groups of five human 8-cell embryos judged morphologically and by rate of cleavage to be normal and free of fragmentation. We report here the results for the gene elements involved in circadian rhythm and cell division, in comparison with the same analyses previously published for human embryonic stem (hES) cells [12], and human fibroblasts before and after induced pluripotency [13].

Methods and materials

Embryos and RNA extraction

Supernumerary embryos were donated by Greek couples undergoing assisted reproduction in the Department of Obstetrics and Gynecology, Athens Medical School, “Alexandra” Maternity Hospital, Athens, Greece. Because Alexandra Hospital has never had a program of embryo cryopreservation, and because Greek law limits embryos transferred to three per cycle, patients undergoing assisted reproduction receive minimal hormone stimulation. Nonetheless, they occasionally produce more than three normally cleaving embryos. Embryos are routinely transferred to the patient at the four-cell stage, approximately 72 h after egg collection and culture in Universal IVF medium (Medicult). The research protocol to utilize normal-appearing embryos, in excess of the three chosen for transfer, was reviewed and approved by the Institutional Review Boards of Alexandra Hospital and the Bedford Research Foundation.

Pilot studies with mouse embryos revealed that linear amplification for microarray analysis was most reproducibly achieved with purified RNAs from no fewer than 20 embryos. Since a human embryo is approximately four times the size of a mouse embryo, and to avoid individual embryo variations, we collected and amplified RNAs from two pools of five human embryos each, judged morphologically and by rate of cleavage to be normal. The two pools of five embryos, fertilized by intracytoplasmic sperm injection and cultured one additional day after the embryo transfer, were donated by nine couples; one couple donated one embryo to each pool, and seven of the couples achieved a pregnancy. Embryos were transferred individually to 0.5 mL flip-top conical tubes in 2 µL of culture medium, flash frozen in liquid nitrogen, and shipped in liquid nitrogen dry shippers to the Bedford Research Foundation laboratory for RNA extraction, amplification and microarray analysis. Embryos were visualized with a dissecting microscope during the thaw process and transferred immediately into 10 µL of Arcturus PicoPure extraction buffer. RNA was isolated and DNase treated according to the manufacturer’s instructions. RNA isolated in parallel from 20 frozen mouse embryos was analyzed in an Agilent Bioanalyzer and found to have 28S to 18S RNA peak ratios of 1.8 to 2.0.

Antisense mRNA amplification

8-Cell mRNAs were amplified according to a protocol previously reported to be linear for RNAs from 10 human eggs [8] and linear in pilot studies from 20 mouse embryos. Briefly, step one was reverse transcription in the presence of T4 Gene Protein with RNase H-free MuLV using an oligo-dT [24] primer linked to a T7 RNA polymerase binding site, in the presence of a second SMART IIA primer with polyG at the 3’ end to bind in the opposite orientation to the polyC 3’ end created by MuLV RT Powerscript, which then completes the strand extension. Step two was twenty cycles of PCR with Smart primers -1 and -2 in the PCR Advantage kit with polymerase AD2; amplified cDNAs were purified from the PCR reaction with the Qiagen PCR Clean-up kit. Antisense cRNAs were amplified from the purified cDNAs by T7 in an overnight reaction, followed by purification with Arcturus PicoPure and analysis on the Agilent Bioanalyzer to verify amplification of cRNAs in the 7S to 10S size range.

Microarray analyses

cRNAs were shipped to MoGene (St. Louis, MO) for Cy3-labelling (Kreatech Kit) and overnight hybridization to Agilent whole human genome 44K microarrays. Mouse embryo RNAs amplified and analyzed in parallel as controls were hybridized to the Agilent 44K mouse development array. Hybridization intensities were normalized by Agilent Feature Extraction Software. RT-PCR verification revealed complete agreement with the microarrays, in keeping with results reported by the MicroArray Quality Control (MAQC) Project [11].

Normalized fluorescent intensities (fluorescence units, FUs) were imported into a FileMaker Pro database containing tables of published data (GEO, NCBI) for the same Agilent microarray analysis of human fibroblasts, human iPS cells [13], and two human ES cell lines, H9 and hES01 [12]. Gene elements were aligned by matching unique Agilent probe numbers. The combined database containing six datasets was used to determine statistical parameters for over- and under-expression, as well as to compare patterns of gene expression among the four cell types. We first assessed differences between the two 8-Cell arrays to estimate differences in RNA amplification linearity between the embryo pools. The sums of FUs (184,359,915, 8-CellA; 195,617,837, 8-CellB) were similar, ratio 1.06. The average ratio for FUs for each of the 44K gene elements on the 8-Cell arrays was 1.0 ± 3.0. The sums of FUs for the hES cell arrays were also similar to each other (123,522,149, hES01; 139,596,169, H9; ratio 1.1). The average ratio of FUs for each gene element on the hES arrays was also 1.0, but the standard deviation was only ± 0.2. The sums of FUs for the fibroblast and iPS cell arrays were lower (117,920,961, fibros; 88,483,505, iPS). Comparing individual gene elements, the average ratio of 8-Cell FUs to hES FUs for each gene element was 19 ± 371, with an even greater difference for 8-Cells relative to fibroblasts, 65 ± 1005, well beyond the variation between the replicate data sets.
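The array-total comparisons above can be rechecked with a few lines of Python. This is a minimal sketch, not the authors' FileMaker Pro workflow: the array totals are copied from the text, while the dictionary layout, the helper name `total_ratio`, and the choice to compare the mean 8-Cell total with the mean hES total are assumptions made for illustration.

```python
# Total fluorescence units (FUs) per array, copied from the text above.
total_fu = {
    "8-CellA": 184_359_915,
    "8-CellB": 195_617_837,
    "hES01": 123_522_149,
    "H9": 139_596_169,
    "fibroblasts": 117_920_961,
    "iPS": 88_483_505,
}


def total_ratio(a: str, b: str) -> float:
    """Ratio of the larger to the smaller array total (hypothetical helper)."""
    larger = max(total_fu[a], total_fu[b])
    smaller = min(total_fu[a], total_fu[b])
    return larger / smaller


# Replicate-to-replicate agreement within each cell type.
embryo_ratio = total_ratio("8-CellA", "8-CellB")
hes_ratio = total_ratio("hES01", "H9")

# Assumption: compare the mean 8-Cell total with the mean hES total.
mean_8cell = (total_fu["8-CellA"] + total_fu["8-CellB"]) / 2
mean_hes = (total_fu["hES01"] + total_fu["H9"]) / 2
cell8_over_hes = mean_8cell / mean_hes

print(f"8-Cell replicate ratio: {embryo_ratio:.2f}")
print(f"hES replicate ratio:    {hes_ratio:.2f}")
print(f"mean 8-Cell / mean hES: {cell8_over_hes:.2f}")
```

The two replicate ratios reproduce the values quoted in the text to the stated precision; the third ratio is only one plausible way of summarizing the overall difference in signal between 8-Cell and hES arrays.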

We also examined the relative fluorescence of four common RT-PCR reference genes: ACTB, GAPDH, RPLP0, UBC. The ratio of 8-Cell A to 8-Cell B varied from 0.6 (ACTB) to 2.4 (UBC), possibly reflecting variation in mRNA amplification, although the ratio of the average 8-Cell signal to the average hES cell signal was within a similar range, 0.2 (GAPDH) to 2.4 (RPLP0). Taken together, the six array data sets were in good agreement for commonly expressed genes, as reported [11]. We reasoned, therefore, that for the purpose of the analyses presented here, ratios ± 7-fold (2 standard deviations from the mean ratio of the 8-cell array elements multiplied by the overall 1.5 higher total 8-Cell array signal) are well outside array fluctuations and probably biologically relevant.
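How probe alignment and the ± 7-fold cutoff might be applied can be sketched as follows. The probe identifiers and FU values are invented for illustration, the function names are hypothetical, and reading "± 7-fold" as a ratio of at least 7 or at most 1/7 is an assumption rather than a definition given in the text.

```python
FOLD_THRESHOLD = 7.0  # the +/- 7-fold cutoff stated in the text


def align_by_probe(a: dict, b: dict) -> dict:
    """Keep only probes present on both arrays: {probe_id: (fu_a, fu_b)}."""
    return {probe: (a[probe], b[probe]) for probe in a.keys() & b.keys()}


def classify(fu_test: float, fu_ref: float,
             threshold: float = FOLD_THRESHOLD) -> str:
    """Label one gene element by its test/reference FU ratio.

    Assumption: 'higher' means ratio >= threshold and 'lower' means
    ratio <= 1/threshold; anything in between is 'within range'.
    """
    if fu_test <= 0 or fu_ref <= 0:
        return "undetermined"  # a ratio is not meaningful without signal
    ratio = fu_test / fu_ref
    if ratio >= threshold:
        return "higher"
    if ratio <= 1 / threshold:
        return "lower"
    return "within range"


# Hypothetical FU values keyed by hypothetical probe identifiers.
cell8 = {"probe_1": 5600.0, "probe_2": 300.0, "probe_3": 40.0, "probe_4": 900.0}
hes = {"probe_1": 100.0, "probe_2": 250.0, "probe_3": 400.0}

calls = {
    probe: classify(fu_8cell, fu_hes)
    for probe, (fu_8cell, fu_hes) in align_by_probe(cell8, hes).items()
}
print(calls)
```

In this toy data, `probe_4` is dropped because it has no matching probe on the reference array, mirroring the requirement that gene elements be aligned by probe number before ratios are taken.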

The lists of gene elements involved in circadian rhythm (Table 1), cell cycle regulation (Tables S2 and 2), and chromosome duplication (Tables S3 and 2) were compiled by comparing GO terms (www.geneontology.org), KEGG (www.genome.ad.jp/kegg) and Reactome (www.reactome.org) pathways, and DAVID 2008 (david.abcc.ncifcrf.gov), and were hand-annotated for accuracy.

Table 1 Circadian rhythm gene elements
Table 2 Gene elements elevated on 8-cell arrays

Results

Overview of array

Almost half of the gene elements detected above threshold level were common to all four cell types, the highest signal intensities of which (1,030 elements, Table S1, bedfordresearch.org/supplement) were enriched for GO terms related to ribonucleoprotein, RNA processing and protein processing. Of the remainder, the 8-Cell arrays exhibited differential expression of gene elements involved in circadian rhythm and cell division.

Circadian clock

Most gene elements of the core intracellular circadian oscillator were over-expressed on the 8-Cell arrays, including CLOCK, detected up to 56-fold higher on the 8-Cell arrays than the hES arrays, and Period (PER1, −2, −3), detected up to 45-fold higher on the 8-Cell arrays (Table 1).

Cell cycle

Of the 138 genes (210 gene elements) on the microarrays directly involved in regulation of the cell cycle, 81 (57%) were detected within ± 7-fold FUs on all six microarrays (Table S2, bedfordresearch.org/supplement). Four gene elements, CDK6, E2F3, RB1 and WEE1, were silent only on the 8-Cell arrays (Table S2). In contrast, AURKC, CCNA1 and CCNB3 were detected above threshold levels specifically on the 8-Cell arrays, with AURKA and −B at least 8-fold higher on the 8-Cell arrays (Table S2, Table 2). Cyclins A1, −B3, −E1, and −G2 were detected at much higher levels on the 8-Cell arrays, as were the kinase inhibitor, CDKN1A (p21), and the kinase activator, CDC25A, each detected over 26-fold higher on the 8-Cell arrays (Tables 2 and S2). UHRF1, known to downregulate RB1 [14], was detected up to 42-fold higher on the 8-Cell arrays than the hES cell arrays (Tables S2 and 2). UHRF2 (NIRF) [15], detected up to 11-fold higher on the 8-Cell arrays than the hES cell arrays, is capable of blocking G1 progression [16] independent of RB.

Chromosome duplication

Of the 126 genes (189 gene probes) on the microarrays directly involved in DNA replication (Table S3), 96 (76%) exhibited ± 7-fold FUs on all arrays. Gene elements essential for the initiation of DNA replication [17], ORCs, CDC6, CDT1 and MCMs, were detected on the gene arrays of all four cell types at approximately equivalent levels (Table S3). The four members of the GINS complex, essential for both the initiation and extension of DNA replication [18–21], were the most over-expressed on the 8-Cell arrays (up to 37-fold higher than hES cells) and the most under-expressed in the fibroblasts (Tables S3 and 2).

Parathymosin (PTMS), a highly conserved nuclear protein, was detected 13-fold higher on the 8-Cell arrays than the hES arrays and 37-fold higher than the fibroblast arrays (Tables S3 and 2). DNA polymerase sigma (POLS, also DNA polymerase kappa), thought to be involved in DNA strand synthesis around cohesin sites, was detected at 20-fold higher levels on the 8-Cell arrays. Two members of cohesin, STAG3 and STAG3L3, were detected >65-fold higher on the 8-Cell arrays (Tables 2 and S3).

Discussion

Circadian clock

The circadian clock is an elegantly simple network, highly conserved from bacteria to man, of transcriptional and translational feedback loops that complete one cycle approximately every 24 h [22–24], perhaps to ensure DNA synthesis during the night to avoid ultraviolet light-induced damage [25]. The core mammalian pacemaker comprises the transcription factors CLOCK, ARNTL (BMAL1), Period (PER1, −2, −3) and Cryptochrome (CRY1, −2). CLOCK and ARNTL form a heterodimer that binds to the E-box elements of PER(s) and CRY(s) to stimulate their expression. Upon reaching a critical concentration, PER/CRY heterodimers inhibit CLOCK/ARNTL, thus leading to a decrease in their own expression. The decrease in PER/CRY allows resumption of CLOCK/ARNTL stimulation of PERs and CRYs, thus repeating the cycle. Mutations in CLOCK have revealed that NPAS2 can at least partially substitute for CLOCK to regulate PER/CRY expression. CLOCK/NPAS2/ARNTL is considered the positive branch of the loop, and PER/CRY the negative branch of the loop.

An estimated 5% to 10% of genes exhibit expression with a circadian pattern [26, 27], some of which augment the core pacemaker, but by definition, elimination or mutation of core elements alters or extinguishes the rhythm [28]. In addition to the gene elements whose expression cycles when the pacemaker is functioning, there is a growing list of genes that do not cycle, but whose expression is influenced by the clock [26, 27]. There is increasing evidence that the pacemaker is active in all cells, and that genes regulated by the clock are cell and tissue specific key pathway elements, such as the circadian expression of heme and factor VII by the liver [29, 30]. Individual fibroblasts maintain their circadian rhythm for up to two weeks in culture [31, 32]. CLOCK also has histone acetyltransferase activity, providing a new clue to the breadth of its transcriptional regulation activity [33].

Period (PER1, −2, −3), the first circadian gene identified in Drosophila [34] and recently described in early zebrafish embryos [35], was detected up to 52-fold higher on the 8-Cell arrays relative to other cells. The onset of Per expression following fertilization of zebrafish eggs establishes an autonomous circadian rhythm in zebrafish embryo cells that entrains a number of pathways, including DNA synthesis.

Cell cycle

An essential feature of the cell cycle is that it proceeds in one direction, to ensure one, and only one, complete replication of chromosomes, enforced by stage specific kinases activated and/or suppressed at key points in the cycle [17, 36]. Important kinase activators are the cyclins, whose expression rises and falls with each cell cycle stage. Cyclins form heterodimers with specific kinases, generally to activate them and define their substrates, but sometimes to suppress their activity. Cell cycle kinase inhibitors, such as CDKN1A(p21), and checkpoint proteins, such as Rb and Wee1, block cell cycle progression, presumably to ensure that critical, stage specific steps are completed.

Cyclin D (CCND) expression is generally stimulated by growth factors to start progression through Gap1 (G1) of the cell cycle; it combines with CDK4 (or CDK6) to phosphorylate a number of proteins, including inactivation of RB, leading to expression of E2F transcription factor-regulated genes such as Cyclin E, needed for the G1/S transition. That the lack of RB expression could render cells immortalized without growth factor stimulation was demonstrated by knocking out RB in mouse fibroblasts, which eliminated the G1 checkpoint, allowing cell growth without growth factors [37]. The relative expression of CCND1, −2, and −3 varies widely among different cell types, including those reported here (Table S2); however, all three appear to be necessary for normal progression of G1 [38].

UHRF1 (ubiquitin PHD RING Finger family member 1, ICBP90) and UHRF2 (NIRF) are members of a newly described tumor suppressor family [15], known to downregulate Rb [16]. Generally, cyclin D/CDK4/6 neutralizes RB, allowing expression of cyclin E (CCNE), an activator of Cdk2. CDK4 was detected at approximately the same levels in all microarrays, but as previously noted, CDK6 was silent on the 8-Cell arrays. CCNE1 and −2 were detected at nearly an order of magnitude higher on the 8-Cell arrays (Tables S2 and 2), perhaps due to the absence of RB. Activation of cyclinE/Cdk2 leads to G1 to S transition, a block to which may be brought about by UHRF2 [16], thus possibly providing an alternative to the RB block to G1/S transition.

MYC, a well studied oncogene that also encodes an E-box element transcription factor, has been reported to stimulate G1 to S transition by a parallel pathway that functions somewhat independent of E2F-induced transcription, although both MYC and E2F are required for normal cell cycles [39]. CDKN1A is a key negative regulator of cell cycle kinases that displays a circadian pattern of expression in mouse muscle and liver [27], and is also negatively regulated by MYC [40]. These considerations suggest the growth factor/CCND/RB paradigm of most cells may be replaced by CCND/UHRF2/MYC/CDKN1A in the 8-Cells.

S Phase

Activated CCNE/CDK2 induces the expression of cyclin A (CCNA), necessary to initiate DNA replication. Both CCNA1 and −2 were detected on the 8-Cell arrays, but only CCNA2 was detected on all other arrays, although at 10-fold lower levels than the 8-Cell arrays. CCNA2 is essential for post-blastocyst mouse embryo development [41]. It binds Cdk2 during S phase and Cdk1 (Cdc2) during G2. Dephosphorylation of Cdk2 by Cdc25A is required to activate cyclinA/Cdk2 in S phase [42]. In the mouse, CCNA1 is expressed in the testis, not the ovary [43], and is required for the progression of male, but not female, germ cells through the first meiotic division. In contrast, human CCNA1 is expressed not only in the testis, but also in several types of leukemic cells, in which it inhibits apoptosis [44]. In frog embryos, cyclinA1 predominates from fertilization through gastrulation, and combines only with Cdk2, not Cdk1, suggesting it does not play a role in stimulating S phase, but only in entry to mitosis, and supports apoptosis of frog embryo cells that have accumulated DNA damage [45].

The ORC-MCM complex is joined by the GINS complex (Sld5, Psf1, Psf2, Psf3, or 5-1-2-3 which, in Japanese, is ‘Go-Ichi-Ni-San’), essential for both the initiation and extension of DNA replication [18–21]. Interestingly, the GINS elements are also up-regulated in the iPS cells relative to fibroblasts (Table 2), supporting their beneficial, but unknown, role in pluripotency.

PTMS has been localized to DNA replication forks [46], in association with glucocorticoid response elements [47], and shown to compete with histone H1, resulting in chromatin remodeling [48], suggesting an enhanced role in both DNA replication and transcription in the 8-Cells.

PolA/primase, which initiates DNA strand replication, and PolD, a more processive enzyme that elongates strands, were detected at similar levels in all four cell types (Table S3).

PolS is a link between DNA replication and the establishment, and maintenance, of cohesion sites during chromosome duplication [49]. One thought is that cohesins, tightly bound to chromatin, require the action of a specially adapted DNA polymerase to continue strand synthesis. Replication may proceed via PolD to a cohesion site, at which point polymerase switching occurs and PolS continues strand elongation through the cohesion site. Cohesin is a complex of proteins that includes STAG3 and STAG3L3, is thought to be established in part by the replication machinery, and is essential for accurate chromosome duplication [49, 50].

Cyclin G2 (CCNG2) is an atypical cyclin that does not bind to a cyclin-dependent kinase, and is generally associated with cell cycle arrest, particularly in response to DNA damage [51]. It associates with centrosomes, so perhaps its role in the 8-Cells is to ensure accurate and timely centrosome replication.

G2 phase

The transition from S phase to G2 phase involves cyclinA/cdk1 (cdc2) activity, completion of DNA replication, nucleosome reformation and the beginning of centrosome replication. CyclinA still bound to cdk2 also plays a role in progression through G2 by coordinating cyclinB/cdk1 activity at the centrosomes and in the nucleus [42]. Pituitary tumor-transforming gene (PTTG) and its binding protein (PTTG1IP) are involved in multiple cell functions, including delaying the onset of mitosis for DNA repair, and stabilizing sister chromatid association as a securin that binds and inactivates separase for most of the cell cycle [52]. The intronless variants, PTTG2 and PTTG3, are expressed in a variety of tissues, but their function is unknown.

The G2/M transition is coordinated by cyclinB (CCNB) replacing cyclinA in association with Cdk1. It may be significant that CCNB1 levels correlated with increased pluripotency, being lowest in the fibroblasts, higher in the iPS cells, and highest in the 8-Cells (Tables S3 and 2). Cyclin B3, a recently identified, unusually large cyclin detected in human testis [53], at lower levels in other tissues during all phases of the cell cycle, was detected only on the 8-Cell arrays (Tables S2 and 2). The function of CCNB3 is unknown; it binds to but does not activate Cdk2, suggesting it may play a negative regulatory role [54]. Recent evidence suggests that in addition to other roles, CDC25B, the phosphatase that activates Cdk1, plays an important role in synchronizing centrosome duplication to mitosis [55].

WEE1, a key G2 phase cell cycle regulator that was silent on the 8-Cell arrays, has a circadian pattern of expression in both muscle and liver of the mouse [27]. In addition, its expression is negatively regulated by TBPL1 [56], which was detected 19-fold higher on the 8-Cell arrays (Table S3).

Mitosis

The Aurora kinases, A, B and C, are highly conserved serine/threonine protein kinases that have important regulatory roles during cell division. Aurora A is central to mitotic spindle organization; Aurora B is localized to centromeres from anaphase to telophase, phosphorylates Histone H3 during mitosis, and remains associated with the spindle midzone where the contractile ring forms [57]. Aurora C co-localizes with Aurora B during the cell cycle and may play a similar role; prior to this report, it was found predominantly in the testis [58].

Once spindle alignment of chromosomes passes the spindle checkpoint, the orderly progression through mitosis is principally regulated by degradation of proteins in sequence. First cyclinA is degraded, followed by cyclinB and securin; loss of securin releases separase to degrade cohesin, allowing sister-chromatid separation. Each of these elements was detected on the 8-Cell arrays at levels equal to or greater than the other cell types, suggesting the 8-Cell demonstrates robust expression of proteins important to accurate chromosome duplication and division.

Conclusions

These findings indicate that RB is silenced in the 8-Cells, perhaps by UHRF1, and that the G1 checkpoint generally imposed by RB may be at least partially replaced by UHRF2. WEE1 is suppressed in the 8-Cells, perhaps by TBPL1, and the G2 checkpoint generally imposed by Wee1 may be replaced by Plk1, whose activation in G2 by Pak1 may be controlled by the circadian expression of CDKN1A, which is also under MYC control. The silence of both RB and WEE1, along with up-regulation of UHRF, MYC, CCNA1, PTTG, AURKC and PLK1, may be markers of totipotency.

The silence of RB and WEE1 in the 8-Cells was a surprise, but supports the concept that early, totipotent blastomeres do not depend on outside stimuli to overcome a G1 block, and that other mechanisms, such as enhanced expression of cohesin proteins, ensure accurate DNA replication and spindle arrangement of chromosomes during G2. More details of these mechanisms will become available when the expression profiles of transcription and translation factors on these arrays are analyzed.

There are significant potential pitfalls in drawing conclusions about gene function from microarray analyses, especially those in which the mRNAs have been enzymatically amplified, even if that has been demonstrated to be linear. Certainly mRNA expression does not accurately predict protein abundance or activity, but several lines of evidence suggest the major differences in levels of array fluorescence for the key cell cycle gene elements described in this report reflect unique cell cycle controls in the totipotent 8-Cell embryos. A recent report [59] that miRNA-34a downregulates CDK6, CCNE2 and E2F3 offers a provocative explanation: miRNA-34a in the 8-Cells may account for the low or absent detection of CDK6, CCNE2 (but not CCNE1), and E2F3, but not the other E2Fs (Tables S2 and 2). The detection of CCNA1, CCNB3 and AURKC only on the 8-Cell arrays is not likely due to artificial loss from the other arrays because the results were reproducible between two laboratories on two continents. A similar argument holds for the circadian clock genes, the expression of which by the 8-Cells supports growing evidence that circadian control of gene expression needs to be more widely appreciated in the design of culture conditions for embryos.