OCT4 (Octamer-Binding Transcription Factor 4)
OCT4 is a member of the class 5 POU (Pit-Oct-Unc) family of homeodomain transcription factors. It is named for its specific binding to the canonical octamer motif (consensus sequence ATGC(A/T)AAT) in the promoter or enhancer regions of its target genes. OCT4 was originally discovered in mouse embryonal carcinoma cells (ECCs) (Lenardo et al. 1989), embryonic stem cells (ESCs) (Okamoto et al. 1990), and primordial germ cells (PGCs) (Scholer et al. 1989). Early studies focused on its roles in mouse embryogenesis at the preimplantation stage (Pesce and Scholer 2001; Wu and Scholer 2014). With the success of culturing human ESCs, the mechanisms by which OCT4 cooperates with other key pluripotency factors (such as SOX2 and NANOG) to maintain the self-renewal and pluripotency of ESCs have been elucidated to a large extent (Li and Belmonte 2017). Since the groundbreaking discovery that a set of transcription factors including OCT4 can reprogram mouse fibroblasts to induced pluripotent stem cells (iPSCs) (Takahashi and Yamanaka 2006), much effort has been devoted to understanding, at the molecular level, how OCT4 acts as a pioneer factor in initiating reprogramming (Zaret and Mango 2016).
Gene and Protein Structure
Human OCT4 protein has 360 amino acids and consists of an N-terminal transactivation domain (NTD, 137 amino acids), a POU domain (152 amino acids), and a C-terminal transactivation domain (CTD, 71 amino acids). The POU domain contains two structurally independent DNA-binding domains, an N-terminal 75-amino acid POUS (POU-specific) domain and a C-terminal 60-amino acid POUH (POU homeo) domain, connected by a linker of 17 amino acids (Fig. 1b). The POUS and POUH domains can independently and flexibly bind the two half sites of the canonical octamer motif (ATGC(A/T)AAT), through which OCT4 recognizes the promoter or enhancer regions of its target genes (Fig. 1c). This flexibility allows OCT4 to form heterodimers with other transcription factors (such as SOX2, Fig. 1c) and to form homodimers in a PORE motif (ATTTGAAA(T/G)GCAAAT)- or MORE motif (ATGCATATGCAT)-binding conformation, depending on the positioning of the POUS and POUH domains relative to each other (Jerabek et al. 2014).
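The domain lengths and motif definitions above can be checked with a short script. The following is a minimal sketch in Python, not a published tool. It reads the octamer consensus as ATGC, then A or T, then AAT, and it scans the forward strand only. The names `DOMAINS`, `POU_PARTS`, `find_motif`, and the toy sequence are illustrative assumptions.

```python
import re

# Domain lengths (amino acids) for human OCT4, as stated in the text.
DOMAINS = {"NTD": 137, "POU": 152, "CTD": 71}
POU_PARTS = {"POUS": 75, "linker": 17, "POUH": 60}

# The three domains should add up to the full-length protein,
# and the POU subdomains plus linker to the POU domain.
assert sum(DOMAINS.values()) == 360
assert sum(POU_PARTS.values()) == DOMAINS["POU"]

# Motifs from the text, written as regular expressions.
# The lookahead (?=...) lets overlapping occurrences be reported.
OCTAMER = re.compile(r"(?=(ATGC[AT]AAT))")   # canonical octamer motif
MORE = re.compile(r"(?=(ATGCATATGCAT))")     # palindromic MORE motif


def find_motif(pattern, seq):
    """Return 0-based start positions of all matches on the forward strand."""
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence containing one octamer and one MORE site.
toy = "ggcATGCAAATccATGCATATGCATtt"

print(find_motif(OCTAMER, toy))  # [3]
print(find_motif(MORE, toy))     # [13]
```

A real analysis would also scan the reverse complement and would normally use a position weight matrix rather than an exact-match pattern, since OCT4 tolerates deviations from the consensus.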
Expression and Regulatory Mechanisms
In vivo, OCT4 is mainly expressed in unfertilized oocytes, zygotes, early embryos (two-cell to blastocyst stage), and PGCs; it is silenced during gastrulation and restricted to the germ cells in adults. In culture systems, OCT4 is abundantly expressed in ESCs and ECCs. OCT4 localizes predominantly to the nucleus in these cells, although it can shuttle between the nucleus and the cytoplasm (Yang et al. 2014). However, the detection of OCT4 in adult stem cells, somatic cancer tissues, and their derived cancer cell lines remains controversial (Wang and Herlyn 2015). Studies of ESC self-renewal and somatic cell reprogramming indicate that an optimal, intermediate level of OCT4 is associated with maximal stemness or pluripotency, whereas either higher or lower levels of OCT4 can lead to differentiation (Karwacki-Neisius et al. 2013; Niwa et al. 2000). The expression of OCT4 is tightly controlled at multiple levels, as described below.
Epigenetic and Transcriptional Regulation
The region upstream of the transcription start site of the mouse Pou5f1 gene harbors three regulatory elements for its transcription, namely, the distal enhancer (DE), proximal enhancer (PE), and proximal promoter (PP) (Kellner and Kikyo 2010; Pan et al. 2002). The two enhancers are activated differentially during the development of mouse embryos: the DE drives Pou5f1 expression in the inner cell mass (ICM), ESCs, and PGCs, whereas the PE activates Pou5f1 expression in epiblast cells. Several nuclear orphan receptors bind to the three regulatory elements to activate or suppress Pou5f1 transcription in mouse ESCs. The best-studied example is germ cell nuclear factor (GCNF), which is rapidly upregulated during retinoic acid-induced ESC differentiation. GCNF suppresses Pou5f1 transcription by binding to the PP and mediating its methylation, which is catalyzed by the DNA methyltransferases DNMT3A and DNMT3B. Several histone modifications take place in the vicinity of the PP when Pou5f1 becomes inactive during early differentiation of mouse ESCs and ECCs. The methyltransferase G9A (EHMT2) catalyzes the di- and trimethylation of lysine 9 on histone H3 (H3K9me2 and H3K9me3) through its SET domain; these marks then recruit HP1 and trigger heterochromatin formation at the Pou5f1 locus. G9A can also recruit DNMT3A and DNMT3B via its ankyrin repeat domain to induce DNA methylation of the three regulatory elements and thus silencing of the Pou5f1 gene. When somatic cells are dedifferentiated and acquire pluripotency, their POU5F1 locus is expected to become demethylated and reactivated.
Alternative Splicing and Translation Initiation
The human POU5F1 transcript can generate three main isoforms by alternative splicing (Fig. 1a), namely, OCT4A (often referred to simply as OCT4), OCT4B, and OCT4B1 (Wang and Dai 2010). OCT4A is by far the most studied isoform, given its crucial roles in early development, pluripotent stem cell (PSC) maintenance, and somatic cell reprogramming. In recent years, there has been increasing interest in OCT4B, which cannot sustain PSC self-renewal but may respond to cell stress. In some cancer cell lines, OCT4A and OCT4B mRNAs are co-expressed, and OCT4B can modulate OCT4A expression as a noncoding RNA, in a manner that mimics microRNA function. OCT4B1 is a more recently discovered OCT4 splice variant. Although it has been considered a putative marker of stemness, its definitive function remains unclear.
OCT4B and OCT4B1 proteins differ from OCT4A mainly in the N-terminal region (Fig. 1a). The OCT4B transcript contains exon 2a instead of exon 1; this sequence introduces an internal ribosomal entry site and different start codons that allow alternative translation initiation (Jerabek et al. 2014). The OCT4B transcript can therefore be translated into three different gene products, OCT4B-265, OCT4B-190, and OCT4B-164, which share identical POUH domains and C-terminal regions. OCT4B-265 has an N-terminal domain that differs from that of OCT4A, while OCT4B-190 and OCT4B-164 lack an N-terminal domain.
Posttranslational Modifications
Posttranslational modifications (PTMs) are chemical modifications of a protein in which a functional group or another protein is covalently, and often reversibly, added to specific amino acid residues. A variety of PTMs, including phosphorylation, ubiquitination, sumoylation, and glycosylation, have been identified as important regulatory mechanisms for OCT4 in ESCs and ECCs (Cai et al. 2012; Shi and Jin 2010; Wang et al. 2014). These PTMs play essential roles in regulating the structure, activity, and localization of the OCT4 protein and its interactions with other cellular components. Reversible PTMs are well placed to sense, relay, and integrate a variety of extracellular and intracellular signals in PSCs. For instance, the phosphorylation of OCT4-T235 by AKT orchestrates the regulation of its stability, subcellular localization, and transcriptional activities, which collectively promote the survival and tumorigenicity of ECCs (Lin et al. 2012). In an in-depth PTM identification study, 14 phosphorylation sites in OCT4 protein were identified across multiple human ESC samples by liquid chromatography-mass spectrometry (Brumbaugh et al. 2012). Such combinatorial PTMs may serve as a bar code and function in concert to allow the storage and transduction of highly specific signals that control epigenetics-based gene transcription. Future studies will be directed toward time course-dependent profiling of OCT4 PTMs in response to defined stimuli in a given cell type and toward systematic profiling and comparison of OCT4 PTMs in different cellular contexts. Such work may help unravel the physiological stimuli and molecular mechanisms that modulate OCT4 oligomerization and its various conformations in vivo.
Target Genes and Main Functions
Genome-wide DNA microarray and chromatin immunoprecipitation (ChIP)-based analyses, combined with bioinformatics analyses, have identified hundreds of target genes that are transcriptionally controlled by OCT4 in PSCs (Babaie et al. 2007; Boyer et al. 2005; Jung et al. 2010). Six distinct OCT4-binding modules have been defined for the OCT4 target genes: OCT4-SOX2 motif, OCT4 monomer motif, SOX2 monomer motif, no OCT4-no SOX2 motif, OCT4 PORE motif, and OCT4 MORE motif (Jung et al. 2010). In addition to protein-coding genes, noncoding RNA targets of OCT4 (e.g., miR-302 and the long noncoding RNA AK028326) have also been found (Li and Belmonte 2017; Shi and Jin 2010). Overall, OCT4, SOX2, and NANOG form a self-reinforcing and intricately connected network that preserves the characteristics of ESCs by activating self-renewal genes while suppressing differentiation genes (Li and Belmonte 2017; Ng and Surani 2011; Shi and Jin 2010). In comparison, OCT4 target genes in somatic cancer cells are far less well characterized (Wang and Herlyn 2015). The main functions of OCT4 are briefly summarized below according to the biological processes in which it participates.
Early Embryo Development
The critical roles of OCT4 in development were established mainly by ablating its functions in vivo. Homozygous OCT4-deficient mouse embryos remain viable through the morula stage but do not give rise to mature ICM cells or produce parietal endoderm at any stage in culture, indicating that OCT4 is essential for pluripotency in the ICM (Nichols et al. 1998). In studies using morpholino oligonucleotide-based gene knockdown and genome profiling, OCT4 was identified as a key activator of zygotic genes in zebrafish, and a similar role for OCT4 in mammalian development was therefore proposed. In contrast, genetic ablation of maternal OCT4 in the mouse revealed that the establishment of totipotency was not affected in maternal OCT4-depleted embryos. These blastocysts were still able to undergo normal cavitation, and the OCT4-GFP and NANOG-GFP reporter genes were expressed in the derived ICM cells, which, however, were not pluripotent (Wu et al. 2013). These results indicate that OCT4 is not essential for the initiation of pluripotency, in contrast to its crucial role in maintaining pluripotency (Wu et al. 2013; Wu and Scholer 2014). Interestingly, OCT4 transcript and protein levels fluctuate during early embryonic development (Wu and Scholer 2014). The requirement for sustained OCT4 expression in subsequent lineage specification was inferred from a study using transgenic mouse embryos in which both maternal and zygotic OCT4 were excised. Embryos devoid of OCT4 gave rise to ICM-like structures but failed to produce primitive endoderm (Le Bin et al. 2014).
PSC Self-Renewal and Pluripotency
With cultured PSC systems, it has been well established that the undifferentiated state of PSCs is governed by a network of transcription factors, including OCT4, SOX2, NANOG, KLF5, ESRRB, and TBX3, which repress differentiation-promoting genes while activating pluripotency genes (He et al. 2009; Li and Belmonte 2017; Ng and Surani 2011). Among them, OCT4, SOX2, and NANOG are considered to be the master pluripotency factors as each of them is unique and indispensable for pluripotency and self-renewal (He et al. 2009). Remarkably, they can regulate their own or each other’s gene transcription via combinatorial interactions, forming a positive feedback transcriptional regulatory circuit that suppresses differentiation. Furthermore, they are at the center of a highly integrated regulatory network composed of many transcriptional and epigenetic regulators (Li and Belmonte 2017; Ng and Surani 2011). Compelling evidence shows that altering the stoichiometry of cellular OCT4/SOX2 will trigger differentiation. For instance, knocking down either OCT4 or SOX2 in mouse ESCs led to their differentiation into trophectoderm-like cells, and elevating OCT4 or SOX2 protein levels in ESCs induced their differentiation into either primitive endoderm and mesoderm or most non-endoderm lineages, respectively (Li and Belmonte 2017; Niwa et al. 2000). Furthermore, OCT4 is upregulated in cells choosing the mesendoderm fate but repressed in cells choosing the neural ectoderm fate, and SOX2 exhibited the opposite expression pattern in both cell fates (Thomson et al. 2011). Thus, it is likely that during gastrulation the developing embryo has differentiation signals that continuously and asymmetrically modulate OCT4 and SOX2 protein levels, altering their binding pattern and binding targets in the genome and leading to cell fate choices. Specific disruption of OCT4/SOX2 interaction in human ECCs led to their differentiation toward mesendodermal lineages (Pan et al. 2016).
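The dose dependence described above can be summarized as a simple lookup. The following is a minimal sketch that encodes only the qualitative outcomes reported for mouse ESCs in the text. The function name `predicted_fate` and the cutoffs `low` and `high` are hypothetical illustration values, not measured thresholds.

```python
def predicted_fate(relative_oct4, low=0.5, high=1.5):
    """Map an OCT4 level (1.0 = normal ESC level) to the outcome
    described in the text for mouse ESCs.

    `low` and `high` are assumed cutoffs chosen for illustration only.
    """
    if relative_oct4 < 0:
        raise ValueError("expression level cannot be negative")
    if relative_oct4 < low:
        # OCT4 knockdown: differentiation into trophectoderm-like cells
        return "trophectoderm-like differentiation"
    if relative_oct4 > high:
        # OCT4 elevation: differentiation into primitive endoderm and mesoderm
        return "primitive endoderm and mesoderm differentiation"
    # Intermediate level: undifferentiated state is maintained
    return "self-renewal"


for level in (0.2, 1.0, 2.0):
    print(level, "->", predicted_fate(level))
```

The sketch treats OCT4 in isolation. As the text notes, the OCT4/SOX2 stoichiometry also influences the outcome, so a fuller model would take both levels as inputs.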
Somatic Cell Reprogramming
The conversion of terminally differentiated somatic cells back to ESC-like pluripotent cells (known as iPSCs) was first achieved using the Yamanaka factors (OCT4, SOX2, KLF4, and c-MYC) or the Thomson factors (OCT4, SOX2, NANOG, and LIN28), both of which include OCT4 (Shi and Jin 2010; Takahashi and Yamanaka 2016). Subsequent studies showed that although each of these transcription factors can be substituted by alternative transcription factors or small-molecule compounds, OCT4 is the most difficult to replace (Jaenisch and Young 2008; Jerabek et al. 2014; Shi and Jin 2010). Neural stem cells and progenitors can be fully reprogrammed to iPSCs with OCT4 alone (Kim et al. 2009), again highlighting the unique importance of OCT4 in cell fate conversion. The reprogramming process is considered to comprise three distinct phases: the initiation phase, the intermediate phase, and the maturation and stabilization phase (Buganim et al. 2013). During the initiation phase, exogenous OCT4, together with other pluripotency factors, functions as a "pioneer factor," binding to chromatin regions that are not accessible to other factors and remodeling them, thereby activating or repressing gene expression (Buganim et al. 2013; Zaret and Mango 2016). Many components of the OCT4-centered pluripotency network and critical epigenetic modifiers are activated rapidly in the initiation phase. OCT4 then activates the expression of mesendodermal genes and suppresses that of ectodermal genes. Conversely, SOX2 promotes ectodermal and suppresses mesendodermal gene expression. Such an expression pattern (OCT4 high, SOX2 low) elicits the mesendodermal features that are often seen in cells undergoing reprogramming and appears to be important for progression to the subsequent reprogramming phases.
Taken together, OCT4 mainly functions at the early phase of somatic cell reprogramming, and its stoichiometric ratio over other pluripotency factors may determine the route of reprogramming (Takahashi and Yamanaka 2016).
Besides the well-characterized roles described above, OCT4 has also been implicated in a number of other processes. First, OCT4 deficiency leads to apoptosis of PGCs rather than their differentiation into a trophectodermal lineage, indicating an uncharacterized role of OCT4 in maintaining the viability of the mammalian germ line (Kehler et al. 2004). Second, recent studies revealed that OCT4 can promote glycolysis by transcriptionally upregulating several key glycolytic enzymes, directly linking ESC metabolism to self-renewal and pluripotency. Third, emerging evidence implicates OCT4 directly in controlling the transcription of several key cell cycle regulators. In general, OCT4 appears to directly or indirectly activate the transcription of cell cycle machinery that promotes the G1/S transition and prevents differentiation. Meanwhile, by suppressing multiple cell cycle genes, OCT4 controls the proper duration of G2 phase to ensure genomic integrity via both transcription-dependent and transcription-independent mechanisms. Reciprocally, cell cycle regulators, especially CDK1, can directly interact with OCT4 and promote its suppressive binding to differentiation genes, thereby maintaining ESC pluripotency. Moreover, although this remains controversial, OCT4 is widely considered to play a critical role in the self-renewal, survival, metastasis, and drug-resistance development of a subpopulation of undifferentiated cancer cells known as cancer stem cells or tumor-initiating cells (Wang and Herlyn 2015).
OCT4 is a transcription factor mainly present in early embryos and adult germ cells. Its expression is tightly controlled at the epigenetic, transcriptional, and posttranscriptional levels, while its functionality is mainly regulated at the posttranslational level. It plays an indispensable role in maintaining the pluripotency and self-renewal of early embryonic cells in vivo and PSCs in vitro. By serving as a pioneer factor, OCT4 also functions crucially in the early phase of somatic cell reprogramming. Emerging studies implicate it in PSC metabolism and cell cycle regulation. The multiple oligomerization and configuration modes of the OCT4 protein, together with its flexible and diversified interactions with hundreds of target genes and binding partners, may account for its unique and pleiotropic roles in these processes.
This work was supported by the National Key Research and Development Program of China (Grant No. 2016YFA0100300) and the National Natural Science Foundation of China (Grant No. 31601103).
- Karwacki-Neisius V, Goke J, Osorno R, Halbritter F, Ng JH, Weisse AY, Wong FC, Gagliardi A, Mullin NP, Festuccia N, et al. Reduced Oct4 expression directs a robust pluripotent state with distinct signaling activity and increased enhancer occupancy by Oct4 and Nanog. Cell Stem Cell. 2013;12:531–45.