Epigenomics of human embryonic stem cells and induced pluripotent stem cells: insights into pluripotency and implications for disease
Human pluripotent cells such as human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) and their in vitro differentiation models hold great promise for regenerative medicine as they provide both a model for investigating mechanisms underlying human development and disease and a potential source of replacement cells in cellular transplantation approaches. The remarkable developmental plasticity of pluripotent cells is reflected in their unique chromatin marking and organization patterns, or epigenomes. Pluripotent cell epigenomes must organize genetic information in a way that is compatible with both the maintenance of self-renewal programs and the retention of multilineage differentiation potential. In this review, we give a brief overview of the recent technological advances in genomics that are allowing scientists to characterize and compare epigenomes of different cell types at an unprecedented scale and resolution. We then discuss how utilizing these technologies for studies of hESCs has demonstrated that certain chromatin features, including bivalent promoters, poised enhancers, and unique DNA modification patterns, are particularly pervasive in hESCs compared with differentiated cell types. We outline these unique characteristics and discuss the extent to which they are recapitulated in iPSCs. Finally, we envision broad applications of epigenomics in characterizing the quality and differentiation potential of individual pluripotent lines, and we discuss how epigenomic profiling of regulatory elements in hESCs, iPSCs and their derivatives can improve our understanding of complex human diseases and their underlying genetic variants.
Keywords: hESC Line; iPSC Line; Chromatin Feature; Histone PTMs; Chromatin Signature
Abbreviations
DMR: differentially methylated region
ESC: embryonic stem cell
hESC: human embryonic stem cell
H3K4me3: trimethylation of lysine 4 of histone H3
H3K27ac: acetylation of lysine 27 of histone H3
H3K27me3: trimethylation of lysine 27 of histone H3
iPSC: induced pluripotent stem cell
One genome, many epigenomes
Embryonic stem cells (ESCs) and the early developmental stage embryo share a unique property called pluripotency, which is the ability to give rise to the three germ layers (endoderm, ectoderm and mesoderm) and, consequently, all tissues represented in the adult organism [1, 2]. Pluripotency can also be induced in somatic cells during in vitro reprogramming, leading to the formation of so-called induced pluripotent stem cells (iPSCs; extensively reviewed in [3, 4, 5, 6, 7]). In order to fulfill the therapeutic potential of human ESCs (hESCs) and iPSCs, an understanding of the fundamental molecular properties underlying the nature of pluripotency and commitment is required, along with the development of methods for assessing biological equivalency among different cell populations.
The functional complexity of the human body, with over 200 specialized cell types and intricately built tissues and organs, arises from a single set of instructions: the human genome. How, then, do distinct cellular phenotypes emerge from this genetic homogeneity? Interactions between the genome and its cellular and signaling environments are the key to understanding how cell-type-specific gene expression patterns arise during differentiation and development. These interactions ultimately occur at the level of the chromatin, which comprises the DNA polymer repeatedly wrapped around histone octamers, forming a nucleosomal array that is further compacted into higher-order structures. Regulatory variation is introduced to the chromatin via alterations within the nucleosome itself - for example, through methylation and hydroxymethylation of DNA, various post-translational modifications (PTMs) of histones, and inclusion or exclusion of specific histone variants [9, 10, 11, 12, 13, 14, 15] - as well as via changes in nucleosomal occupancy, mobility and organization [16, 17]. In turn, these alterations modulate access of sequence-dependent transcriptional regulators to the underlying DNA, the level of chromatin compaction, and communication between distant chromosomal regions. The entirety of chromatin regulatory variation in a specific cellular state is often referred to as the 'epigenome'.
Technological advances have made the exploration of epigenomes feasible in a rapidly increasing number of cell types and tissues. Systematic efforts at such analyses have been undertaken by the human ENCyclopedia Of DNA Elements (ENCODE) and NIH Roadmap Epigenomics projects [20, 21]. These and other studies have already produced, and will continue to generate in the near future, an overwhelming number of genome-wide datasets that are often not readily comprehensible to many biologists and physicians. However, given the importance of epigenetic patterns in defining cell identity, understanding and utilizing epigenomic mapping will become a necessity in both basic and translational stem cell research. In this review, we strive to provide an overview of the main concepts, technologies and outputs of epigenomics in a form that is accessible to a broad audience. We summarize how epigenomes are studied, discuss what we have learned so far about unique epigenetic properties of hESCs and iPSCs, and envision direct implications of epigenomics in translational research and medicine.
Technological advances in genomics and epigenomics
Epigenomics is defined here as genomic-scale studies of chromatin regulatory variation, including patterns of histone PTMs, DNA methylation, nucleosome positioning and long-range chromosomal interactions. Over the past 20 years, many methods have been developed to probe different forms of this variation. For example, a plethora of antibodies recognizing specific histone modifications has been developed and used in chromatin immunoprecipitation (ChIP) assays for studying the local enrichment of histone PTMs at specific loci [22, 23]. Similarly, bisulfite-sequencing (BS-seq)-based, restriction enzyme-based and affinity-based approaches for analyzing DNA methylation have been established [24, 25], in addition to methods to identify genomic regions with low nucleosomal content (for example, the DNase I hypersensitivity assay) and to probe long-range chromosomal interactions (such as chromosome conformation capture, or 3C).
Table 1. Next-generation sequencing-based methods used in epigenomic studies
Chromatin features listed: histone post-translational modifications; chromatin modifiers and remodelers; nucleosome positioning and turnover; long-range chromatin interactions; allele-specific chromatin signatures
Table 2. Chromatin signatures defining different classes of regulatory elements
Active promoters: main H3K4me3/2; additional H3ac, H4ac
Poised promoters (bivalent): main H3K4me3/2, H3K27me3; additional H2AZ, MacroH2A; more prevalent in ESCs/iPSCs
Inactive promoters (CpG island-poor)
Active enhancers: presence of p300, H3K4me1/2, H3K27ac; absence of H3K4me3, H3K27me3
Poised enhancers: presence of p300, H3K4me1/2, H3K27me3; absence of H3K4me3, H3K27ac; prevalent in hESCs
Long non-coding RNAs: promoter H3K4me3; gene body H3K36me3
Epigenomic features of hESCs
ESCs provide a robust, genomically tractable in vitro model to investigate the molecular basis of pluripotency and embryonic development [1, 2]. In addition to sharing many fundamental properties with chromatin of somatic cells, chromatin of pluripotent cells appears to have unique features, such as the increased mobility of many structural chromatin proteins, including histones and heterochromatin protein 1, and differences in nuclear organization suggestive of a less compacted chromatin structure [48, 49, 50, 51]. Recent epigenomic profiling of hESCs has uncovered several characteristics that, although not absolutely unique to hESCs, appear particularly pervasive in these cells [52, 53, 54]. Below, we focus on these characteristics and their potential role in mediating the epigenetic plasticity of hESCs.
Bivalent domains at promoters
The term 'bivalent domains' is used to describe chromatin regions that are concomitantly modified by the trimethylation of lysine 4 of histone H3 (H3K4me3), a modification generally associated with transcriptional initiation, and trimethylation of lysine 27 of histone H3 (H3K27me3), a modification associated with Polycomb-mediated gene silencing. Although first described and most extensively characterized in mouse ESCs (mESCs) [55, 56], bivalent domains are also present in hESCs [57, 58], and in both species they mark transcription start sites of key developmental genes that are poorly expressed in ESCs, but induced upon differentiation. Although defined by the presence of H3K27me3 and H3K4me3, bivalent promoters are also characterized by other features, such as the occupancy of the histone variant H2AZ. Upon differentiation, bivalent domains at specific promoters resolve into a transcriptionally active H3K4me3-marked monovalent state, or a transcriptionally silent H3K27me3-marked monovalent state, depending on the lineage commitment [42, 56]. However, a subset of bivalent domains is retained upon differentiation [42, 60], and bivalently marked promoters have been observed in many progenitor cell populations, perhaps reflecting their remaining epigenetic plasticity. Nevertheless, promoter bivalency seems considerably less abundant in differentiated cells, and appears to be further diminished in unipotent cells [42, 54, 56]. These observations led to the hypothesis that bivalent domains are important for pluripotency, allowing early developmental genes to remain silent yet able to rapidly respond to differentiation cues. A similar function of promoter bivalency can be hypothesized for multipotent or oligopotent progenitor cell types. However, it needs to be more rigorously established how many of the apparently 'bivalent' promoters observed in progenitor cells truly possess this chromatin state, and how many reflect heterogeneity of the analyzed cell populations, in which some cells display H3K4me3-only and others H3K27me3-only signatures at specific promoters.
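The heterogeneity caveat can be made concrete with a toy calculation. The following is a minimal sketch, not a published analysis method: it assumes each cell carries a simple present/absent state for each mark at one promoter, and that a population-level assay calls a mark "present" when a hypothetical fraction of cells (`min_fraction`, an illustrative value) carries it. Under those assumptions, a mixture of monovalent cells is indistinguishable from a truly bivalent population.

```python
from typing import List, Tuple

# One cell at one promoter: (has H3K4me3, has H3K27me3).
Cell = Tuple[bool, bool]


def bulk_call(cells: List[Cell], min_fraction: float = 0.25) -> str:
    """Population-level call, as a bulk assay would see it.

    min_fraction is a hypothetical detection threshold chosen for illustration.
    """
    n = len(cells)
    k4_fraction = sum(1 for k4, _ in cells if k4) / n
    k27_fraction = sum(1 for _, k27 in cells if k27) / n
    has_k4 = k4_fraction >= min_fraction
    has_k27 = k27_fraction >= min_fraction
    if has_k4 and has_k27:
        return "bivalent"
    if has_k4:
        return "H3K4me3-only"
    if has_k27:
        return "H3K27me3-only"
    return "unmarked"


def truly_bivalent_fraction(cells: List[Cell]) -> float:
    """Fraction of individual cells carrying both marks at the promoter."""
    return sum(1 for k4, k27 in cells if k4 and k27) / len(cells)


# Every cell carries both marks.
homogeneous = [(True, True)] * 100
# Half the cells are H3K4me3-only, half are H3K27me3-only.
mixed = [(True, False)] * 50 + [(False, True)] * 50

for name, population in [("homogeneous", homogeneous), ("mixed", mixed)]:
    print(name, bulk_call(population), truly_bivalent_fraction(population))
```

Both populations receive the same bulk call, although no single cell in the mixed population is bivalent. Distinguishing the two scenarios requires information beyond independent population-level profiles of each mark.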
Poised enhancers
In multicellular organisms, distal regulatory elements, such as enhancers, play a central role in cell-type and signaling-dependent gene regulation [61, 62]. Although embedded within the vast non-coding genomic regions, active enhancers can be identified by epigenomic profiling of certain histone modifications and chromatin regulators [63, 64, 65]. A recent study revealed that unique chromatin signatures distinguish two functional enhancer classes in hESCs: active and poised. Both classes are bound by coactivators (such as p300 and BRG1) and marked by H3K4me1, but while the active class is enriched in acetylation of lysine 27 of histone H3 (H3K27ac), the poised enhancer class is marked by H3K27me3 instead. Active enhancers are typically associated with genes expressed in hESCs and in the epiblast, whereas poised enhancers are located in proximity to genes that are inactive in hESCs, but which play critical roles during early stages of post-implantation development (for example, gastrulation, neurulation, early somitogenesis). Importantly, upon signaling stimuli, poised enhancers switch to an active chromatin state in a lineage-specific manner and are then able to drive cell-type-specific gene expression patterns. It remains to be determined whether H3K27me3-mediated enhancer poising represents a unique feature of hESCs. Recent work by Creighton et al. suggests that poised enhancers are also present in mESCs and in various differentiated mouse cells, although in this case the poised enhancer signature did not involve H3K27me3, but H3K4me1 only. Nevertheless, our unpublished data indicate that, similar to the bivalent domains at promoters, simultaneous H3K4me1/H3K27me3 marking at enhancers is much less prevalent in more restricted cell types compared with both human and mouse ESCs (A Rada-Iglesias, R Bajpai and J Wysocka, unpublished observations). Future studies should clarify whether poised enhancers are marked by the same chromatin signature in hESCs, mESCs and differentiated cell types, and evaluate the functional relevance of the Polycomb-mediated H3K27 methylation at enhancers.
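The active and poised signatures described above amount to a small decision rule. The sketch below encodes that rule under simplifying assumptions: each feature is reduced to a present/absent call at a candidate distal element, and the `Marks` type, the `classify_enhancer` function and the returned labels are hypothetical names introduced only for illustration. Real analyses work with quantitative enrichment and must choose thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marks:
    """Present/absent calls at one candidate distal element (illustrative)."""
    p300: bool
    h3k4me1: bool
    h3k27ac: bool
    h3k27me3: bool
    h3k4me3: bool = False  # promoter-associated mark, expected to be absent


def classify_enhancer(marks: Marks) -> str:
    """Apply the hESC enhancer signatures described in the text."""
    # Both enhancer classes are coactivator-bound, carry H3K4me1 and lack H3K4me3.
    if marks.h3k4me3 or not (marks.p300 and marks.h3k4me1):
        return "no enhancer signature"
    if marks.h3k27ac and not marks.h3k27me3:
        return "active"
    if marks.h3k27me3 and not marks.h3k27ac:
        return "poised"
    # Both or neither of H3K27ac/H3K27me3: not covered by the two classes.
    return "unclassified"


print(classify_enhancer(Marks(p300=True, h3k4me1=True, h3k27ac=True, h3k27me3=False)))
print(classify_enhancer(Marks(p300=True, h3k4me1=True, h3k27ac=False, h3k27me3=True)))
```

The H3K4me1-only poised signature reported in mouse cells would fall into the "unclassified" branch here, which reflects the open question of whether the same signature applies across species and cell types.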
Unique DNA methylation patterns
Mammalian DNA methylation occurs at position 5 of cytosine residues, generally in the context of CG dinucleotides (that is, CpG dinucleotides), and has been associated with transcriptional silencing both at repetitive DNA, including transposon elements, and at gene promoters [13, 14]. Initial DNA methylation studies of mESCs revealed that most CpG-island-rich gene promoters, which are typically associated with house-keeping and developmental genes, are DNA hypomethylated, whereas CpG-island-poor promoters, typically associated with tissue-specific genes, are hypermethylated [41, 60]. Moreover, methylation of H3K4 at both promoter-proximal and distal regulatory regions is anti-correlated with their DNA methylation level, even at CpG-island-poor promoters. Nevertheless, these general correlations are not ESC-specific features as they have also been observed in a variety of other cell types [25, 60, 68]. On the other hand, recent comparisons of DNA methylation in early pre- and postimplantation mouse embryos with those of mESCs revealed that, surprisingly, mESCs accumulate promoter DNA methylation that is more characteristic of the postimplantation stage embryos rather than the blastocyst from which they are derived.
Although the coverage and resolution of mammalian DNA methylome maps have been steadily increasing, whole-genome analyses of human methylomes at single-nucleotide resolution require an enormous sequencing effort and have been reported only recently. These analyses revealed that in hESCs, but not in differentiated cells, a significant proportion (approximately 25%) of methylated cytosines are found in a non-CG context. Non-CG methylation is a common feature of plant epigenomes and, while it has been previously reported to occur in mammalian cells, its contribution to as much as a quarter of all cytosine methylation in hESCs had not been anticipated. It remains to be established whether non-CG methylation in hESCs is functionally relevant or, alternatively, is simply a by-product of high levels of de novo DNA methyltransferases and a hyperdynamic chromatin state that characterizes hESCs [49, 50, 72]. Regardless, its prevalence in hESC methylomes emphasizes unique properties of pluripotent cell chromatin. However, one caveat to the aforementioned study and all other BS-seq-based analyses of DNA methylation is their inability to distinguish between methylcytosine (5mC) and hydroxymethylcytosine (5hmC), as both are refractory to bisulfite conversion [15, 73], and thus it remains unclear how much of what has been mapped as DNA methylation in fact represents hydroxymethylation.
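To illustrate what "non-CG context" means at the sequence level, the sketch below assigns each methylated cytosine to the CG, CHG or CHH context (the conventional nomenclature, in which H stands for A, C or T; these labels are not used elsewhere in this review) and reports the non-CG fraction. It is a toy example under stated assumptions: a single forward-strand sequence, 0-based positions, and hypothetical function names. The example sequence and positions are invented and carry no biological meaning.

```python
from typing import Iterable


def cytosine_context(seq: str, pos: int) -> str:
    """Return 'CG', 'CHG', 'CHH' or 'unknown' for the cytosine at seq[pos].

    Only the forward strand is considered; pos is 0-based.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError(f"position {pos} is not a cytosine")
    following = seq[pos + 1:pos + 3]  # up to two downstream bases
    if following[:1] == "G":
        return "CG"
    # Too close to the sequence end, or ambiguous bases such as N.
    if len(following) < 2 or any(base not in "ACGT" for base in following):
        return "unknown"
    return "CHG" if following[1] == "G" else "CHH"


def non_cg_fraction(seq: str, methylated_positions: Iterable[int]) -> float:
    """Fraction of methylated cytosines (with known context) outside CG."""
    contexts = [cytosine_context(seq, p) for p in methylated_positions]
    known = [c for c in contexts if c != "unknown"]
    if not known:
        raise ValueError("no methylated cytosines with a known context")
    return sum(c != "CG" for c in known) / len(known)


toy_sequence = "ACGTCAGCTTCG"
toy_methylated = [1, 4, 7, 10]  # contexts: CG, CHG, CHH, CG
print([cytosine_context(toy_sequence, p) for p in toy_methylated])
print(non_cg_fraction(toy_sequence, toy_methylated))  # 0.5
```

Note that a bisulfite-based call of "methylated" at any of these positions would be equally consistent with 5mC or 5hmC, so the same context tally applies to the combined signal.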
Another, previously unappreciated modification of DNA, hydroxymethylation, has become a subject of considerable attention. DNA hydroxymethylation is mediated by the TET family enzymes, which convert 5mC to 5hmC. Recent studies have shown that mESCs express high levels of TET proteins, and consequently their chromatin is 5hmC-rich [74, 75], a property that, to date, has only been observed in a limited number of other cell types - for example, in Purkinje neurons. Although the functionality of 5hmC is still unclear, it has been suggested that it represents a first step in either active or passive removal of DNA methylation from select genomic loci. New insights into 5hmC genomic distribution in mESCs have been obtained from studies that utilized immunoprecipitation with 5hmC-specific antibodies coupled to next-generation sequencing or microarray technology, respectively [77, 78], revealing that a significant fraction of 5hmC occurs within gene bodies of transcriptionally active genes and, in contrast to 5mC, also at CpG-rich promoters, where it overlaps with the occupancy of the Polycomb complex PRC2. Intriguingly, a significant fraction of the intra-genic 5hmC occurs within a non-CG context, which prompts investigating whether a subset of the reported non-CG methylation in hESCs might actually represent 5hmC. Future studies should establish whether hESCs show a similar 5hmC distribution to mESCs. More importantly, it will be essential to re-evaluate the extent to which cytosine residues that have been mapped as methylated in hESCs are indeed hydroxymethylated, and to determine the functional relevance of this novel epigenetic mark.
Reduced genomic blocks marked by repressive histone modifications
A comprehensive study of epigenomic profiles in hESCs and human fibroblasts showed that, in differentiated cells, regions enriched in histone modifications associated with heterochromatin formation and gene repression, such as H3K9me2/3 and H3K27me3, are significantly expanded. These two histone methylation marks cover only 4% of the hESC genome, but well over 10% of the human fibroblast genome. Parallel observations have been made independently in mice, where large H3K9me2-marked regions are more frequent in adult tissues in comparison with mESCs. Interestingly, H3K9me2-marked regions largely overlap with the recently described nuclear lamina-associated domains, suggesting that the appearance or expansion of the repressive histone methylation marks might reflect a profound three-dimensional reorganization of chromatin during differentiation. Indeed, heterochromatic foci increase in size and number upon ESC differentiation, and it has been proposed that an 'open', hyperdynamic chromatin structure is a crucial component of pluripotency maintenance [48, 49, 50].
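Genome coverage figures of this kind are obtained by taking the union of all regions enriched for the marks and dividing by genome length. The following minimal sketch shows that arithmetic under simplifying assumptions: a single toy chromosome, half-open `(start, end)` intervals, and invented coordinates chosen only so that the two toy samples land near the proportions quoted above. The function name is hypothetical.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open: start inclusive, end exclusive


def covered_fraction(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of the genome covered by the union of the intervals."""
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


TOY_GENOME_LENGTH = 1000
# Invented coordinates; overlapping regions from two marks are counted once.
toy_hesc_blocks = [(100, 120), (110, 140)]
toy_fibroblast_blocks = [(100, 180), (150, 200), (600, 650)]

print(covered_fraction(toy_hesc_blocks, TOY_GENOME_LENGTH))        # 0.04
print(covered_fraction(toy_fibroblast_blocks, TOY_GENOME_LENGTH))  # 0.15
```

Merging before summing matters because regions marked by H3K9me2/3 and H3K27me3 can overlap; summing the raw interval lengths would overstate coverage.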
Are hESCs and iPSCs epigenetically equivalent?
Since Yamanaka's seminal discovery in 2006 showing that introduction of the four transcription factors Oct4, Sox2, Klf4 and c-Myc is sufficient to reprogram fibroblasts to a pluripotent state, progress in the iPSC field has been breathtaking [4, 83, 84]. iPSCs have now been generated from a variety of adult and fetal somatic cell types using a myriad of alternative protocols [3, 6, 7]. Remarkably, the resulting iPSCs seem to share phenotypic and molecular properties of ESCs; these properties include pluripotency, self-renewal and similar gene expression profiles. However, an outstanding question remains: to what extent are hESCs and iPSCs functionally equivalent? The most stringent pluripotency assay, tetraploid embryo complementation, demonstrated that mouse iPSCs can give rise to all tissues of the embryo proper [85, 86]. On the other hand, many iPSC lines do not support tetraploid complementation, and those that do remain quite inefficient in comparison with mESCs [85, 87]. Initial genome-wide comparisons between ESCs and iPSCs focused on gene expression profiles, which reflect the transcriptional state of a given cell type, but not its developmental history or differentiation potential [4, 84, 88]. These additional layers of information can be uncovered, at least partially, by examining epigenetic landscapes. In this section, we summarize studies comparing DNA methylation and histone modification patterns in ESCs and iPSCs.
Sources of variation in iPSC and hESC epigenetic landscapes
Bird's eye view comparisons show that all major features of the hESC epigenome are re-established in iPSCs [89, 90]. On the other hand, when more subtle distinctions are considered, recent studies have reported differences between iPSC and hESC DNA methylation and gene expression patterns [90, 91, 92, 93, 94]. Potential sources of these differences can be largely divided into three groups: (i) experimental variability in cell line derivation and culture; (ii) genetic variation among cell lines; and (iii) systematic differences representing hotspots of aberrant epigenomic reprogramming.
Although differences arising as a result of experimental variability do not constitute biologically meaningful distinctions between the two stem cell types, they can be informative when assessing the quality and differentiation potential of individual lines [91, 95]. The second source of variability is a natural consequence of the genetic variation among human cells or embryos from which iPSCs and hESCs are respectively derived. Genetic variation likely underlies many of the line-to-line differences in DNA and histone modification patterns, underscoring the need for using cohorts of cell lines and stringent statistical analyses to draw systematic comparisons between hESCs, healthy donor-derived iPSCs, and disease-specific iPSCs. In support of the significant impact of human genetic variation on epigenetic landscapes, recent studies of specific chromatin features in lymphoblastoid cells [96, 97] isolated from related and unrelated subjects showed that individual, as well as allele-specific, heritable differences in chromatin signatures can be largely explained by the underlying genetic variants. Although genetic differences make comparisons between hESC and iPSC lines less straightforward, we will discuss later how these can be harnessed to uncover the role of specific regulatory sequence variants in human disease. Finally, systematic differences between hESC and iPSC epigenomes may arise through the incomplete erasure of marks characteristic of the somatic cell type of origin (somatic memory) during iPSC reprogramming, or defects in the re-establishment of hESC-like patterns in iPSCs, or as a result of selective pressure during reprogramming and the appearance of iPSC-specific signatures [90, 98]. Regardless of the underlying sources of variation, understanding epigenetic differences between hESC and iPSC lines will be essential for harnessing the potential of these cells in regenerative medicine.
Remnants of the somatic cell epigenome in iPSCs: lessons from DNA methylomes
Studies of stringently defined models of mouse reprogramming have shown that cell-type-of-origin-specific differences in gene expression and differentiation potential exist in early passage iPSCs, leading to the hypothesis that an epigenetic memory of previous fate persists in these cells [98, 99]. This epigenetic memory has been attributed to the presence of residual somatic DNA methylation in iPSCs, most of which is retained within regions located outside of, but in proximity to, CpG islands, at so-called 'shores' [98, 100]. The incomplete erasure of somatic methylation appears to predispose iPSCs to differentiation into fates related to the cell type of origin, while restricting differentiation towards other lineages. Importantly, this residual memory of past fate appears to be transient, and diminishes upon continuous passaging, serial reprogramming or treatment with small molecule inhibitors of histone deacetylase or DNA methyltransferase activity [98, 99]. These results suggest that remnants of somatic DNA methylation are not actively maintained in iPSCs during replication and thus can be erased through cell division.
More recently, whole-genome, single-base-resolution DNA methylome maps have been generated for five distinct human iPSC lines and compared with those of hESCs and somatic cells. That study demonstrated that although the hESC and iPSC DNA methylation landscapes are remarkably similar overall, hundreds of differentially methylated regions (DMRs) exist. Nevertheless, only a small fraction of DMRs represents failure in erasure of somatic DNA methylation, whereas the vast majority corresponds to either hypomethylation (defects in the methylation of genomic regions that are marked in hESCs) or the appearance of iPSC-specific methylation patterns, not present in hESCs or the somatic cell type of origin. Moreover, these DMRs are likely to be resistant to passaging, as the methylome analyses were performed using relatively late passage iPSCs. Because only a limited number of iPSC and hESC lines were used in the study, genetic and experimental variation among individual lines may contribute substantially to the reported DMRs. However, a significant subset of DMRs is shared among iPSC lines of different genetic background and cell type of origin, and is transmitted through differentiation, suggesting that at least some DMRs may represent non-stochastic epigenomic hotspots that are refractory to reprogramming.
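The three DMR categories described above can be expressed as a simple rule on the mean methylation level of a region in hESCs, iPSCs and the somatic cell type of origin. The sketch below is a deliberate simplification and not the procedure used in the cited study: the function name, the category labels and the `delta` threshold are hypothetical, and real DMR calling involves statistical testing across replicates and lines.

```python
def classify_region(hesc: float, ipsc: float, somatic: float,
                    delta: float = 0.3) -> str:
    """Classify one region from mean methylation levels in [0, 1].

    delta is a hypothetical minimum difference for calling two levels distinct.
    """
    if abs(ipsc - hesc) < delta:
        return "not a DMR"
    if ipsc < hesc:
        # Region is methylated in hESCs but under-methylated in iPSCs.
        return "hypomethylation"
    if abs(ipsc - somatic) < delta:
        # Gained relative to hESCs and matching the cell type of origin.
        return "somatic memory"
    # Gained relative to hESCs and absent from the cell type of origin.
    return "iPSC-specific"


# Invented example values: (hESC, iPSC, somatic).
examples = {
    "region_a": (0.1, 0.8, 0.9),
    "region_b": (0.9, 0.2, 0.85),
    "region_c": (0.1, 0.8, 0.1),
    "region_d": (0.8, 0.75, 0.2),
}
for name, levels in examples.items():
    print(name, classify_region(*levels))
```

In this simplified scheme, any loss of methylation relative to hESCs is reported as hypomethylation regardless of the somatic level, which mirrors the way the categories are described in the text.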
Reprogramming resistance of subtelomeric and subcentromeric regions?
In addition to erasing somatic epigenetic marks, an essential component of reprogramming is the faithful re-establishment of hESC-like epigenomic features. Although, as discussed above, most of the DNA methylation is correctly re-established during reprogramming, large megabase-scale regions of reduced methylation can be detected in iPSCs, often within the vicinity of centromeres and telomeres. Biased depletion of DNA methylation from subcentromeric and subtelomeric regions correlates with blocks of H3K9me3 that mark these loci in iPSCs and somatic cells, but not in hESCs [79, 90]. Aberrant DNA methylation in proximity to centromeres and telomeres suggests that these chromosomal territories may have features that render them more resistant to epigenetic changes. Intriguingly, histone variant H3.3, which is generally implicated in transcription-associated and replication-independent histone deposition, was recently found to also occupy subtelomeric and subcentromeric regions in mESCs and mouse embryos [36, 101, 102]. It has been previously suggested that H3.3 plays a critical role in the maintenance of transcriptional memory during reprogramming of somatic nuclei by the egg environment (that is, reprogramming by somatic cell nuclear transfer), and it is tempting to speculate that a similar mechanism may contribute to the resistance of the subtelomeric and subcentromeric regions to reprogramming in iPSCs.
Anticipating future fates: reprogramming at regulatory elements
Relevance of epigenomics for human disease and regenerative medicine
In this section, we envision how recent advances in epigenomics can be used to gain insight into human development and disease, and to facilitate the transition of stem cell technologies towards clinical applications.
Using epigenomics to predict developmental robustness of iPSC lines for translational applications
As discussed earlier, epigenomic profiling can be used to annotate functional genomic elements in a genome-wide and cell-type specific manner. Distinct chromatin signatures can distinguish active and poised enhancers and promoters, identify insulator elements and uncover non-coding RNAs transcribed in a given cell type [42, 56, 63, 64, 66, 104, 105] (Table 2). Given that developmental potential is likely to be reflected in the epigenetic marking of promoters and enhancers linked to poised states, epigenomic maps should be more predictive of iPSC differentiation capacity than transcriptome profiling alone (Figure 1). However, before epigenomics can be used as a standard tool in assessing iPSC and hESC quality in translational applications, the appropriate resources need to be developed. For example, although ChIP-seq analysis of chromatin signatures is extremely informative, its reliance on antibody quality requires the development of renewable, standardized reagents. Also, importantly, to assess the significance of epigenomic pattern variation, sufficient numbers of reference epigenomes need to be obtained from hESC and iPSC lines that are representative of genetic variation and have been rigorously tested in a variety of differentiation assays. The first forays towards the development of such tools and resources have already been made [89, 91, 106, 107].
Annotating regulatory elements that orchestrate human differentiation and development
Cell-type-specific regulatomes as a tool for understanding the role of non-coding mutations in human disease
During the past few years, genome-wide association studies have dramatically expanded the catalog of genetic variants associated with some of the most common human disorders, such as various cancer types, type 2 diabetes, obesity, cardiovascular disease, Crohn's disease and cleft lip/palate [111, 112, 113, 114, 115, 116, 117, 118]. One recurrent observation is that most disease-associated variants occur in non-coding parts of the human genome, suggesting a large non-coding component in human phenotypic variation and disease. Indeed, several studies document a critical role for genetic aberrations occurring within individual distal enhancer elements in human pathogenesis [119, 120, 121]. To date, the role of regulatory sequence mutation in human disease has not been systematically examined. However, given the rapidly decreasing cost of high-throughput sequencing and the multiple disease-oriented whole genome sequencing projects that are under way, the coming years will bring the opportunity and challenge to ascribe functional significance to disease-associated non-coding mutations. Doing so will require both an ability to identify and obtain cell types relevant to disease, and the ability to characterize their specific regulatomes.
We envision that combining pluripotent cell differentiation models with epigenomic profiling will provide an important tool for uncovering the role of non-coding mutations in human disease. For example, if the disease of interest affects a particular cell type that can be derived in vitro from hESCs, characterizing the reference regulatome of this cell type, as described above, will shrink the vast genomic regions that might be implicated in disease into a much smaller regulatory space that can be more effectively examined for recurrent variants that are associated with disease (Figure 2a). The function of these regulatory variants can be further studied using in vitro and in vivo models, of which iPSC-based 'disease in a dish' models appear particularly promising. For example, disease-relevant cell types obtained from patient-derived and healthy-donor-derived iPSCs can be used to study the effects of the disease genotype on cell-type-specific regulatomes (Figure 2b). Moreover, given that many, if not most, regulatory variants are likely to be heterozygous in patients, loss or gain of chromatin features associated with those variants (such as p300 binding, histone modifications and nucleosome occupancy) can be assayed independently for each allele within the same iPSC line. Indeed, allele-specific sequencing assays are already being developed [42, 96, 97, 124] (Table 1). Moreover, these results can be compared with allele-specific RNA-seq transcriptome analyses from the same cells, yielding insights into the effects of disease-associated regulatory alleles on the transcription of genes located in relative chromosomal proximity [96, 125].
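One simple way to ask whether a chromatin feature differs between the two alleles of a heterozygous variant is to compare the numbers of sequencing reads carrying each allele against the 1:1 ratio expected in the absence of allelic bias. The sketch below implements an exact two-sided binomial test for that comparison using only the Python standard library. It is an illustration under stated assumptions, not the method of the cited allele-specific studies: the function name is hypothetical, and practical analyses would also need to address issues such as reference-allele mapping bias, overdispersion and multiple testing, none of which are handled here.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial test against a 50:50 allelic ratio.

    Sums the probabilities of all outcomes that are no more likely than
    the observed split of reads between the two alleles.
    """
    total = ref_reads + alt_reads
    if total == 0:
        return 1.0  # no reads, no evidence of imbalance
    observed_weight = comb(total, ref_reads)
    # With p = 0.5 every outcome k has probability comb(total, k) / 2**total,
    # so outcomes can be compared through their binomial coefficients alone.
    extreme_weight = sum(
        comb(total, k)
        for k in range(total + 1)
        if comb(total, k) <= observed_weight
    )
    return extreme_weight / 2 ** total


# Invented read counts at one heterozygous site in a ChIP-seq experiment.
print(allelic_imbalance_pvalue(10, 10))  # balanced: 1.0
print(allelic_imbalance_pvalue(15, 5))   # moderately skewed
print(allelic_imbalance_pvalue(20, 0))   # fully skewed
```

Because both alleles are measured in the same cells and the same experiment, this within-sample comparison is insensitive to many sources of line-to-line experimental variation discussed earlier.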
Conclusions and future perspective
Analyses of hESC and iPSC chromatin landscapes have already provided important insights into the molecular basis of pluripotency, reprogramming and early human development. Our current view of the pluripotent cell epigenome has been largely acquired due to recent advances in next-generation sequencing technologies, such as ChIP-seq or MethylC-seq. Several chromatin features, including bivalent promoters, poised enhancers and pervasive non-CG methylation, seem to be more abundant in hESCs compared with differentiated cells. It will be important in future studies to dissect the molecular function of these epigenomic attributes and their relevance for hESC biology. Epigenomic tools are also being widely used in the evaluation of iPSC identity. In general, the epigenomes of iPSC lines seem highly similar to those of hESC lines, although recent reports suggest that differences in DNA methylation patterns exist between the two pluripotent cell types. It will be important to understand the origins of these differences (that is, somatic memory, experimental variability, genetic variation), as well as their impact on iPSC differentiation potential or clinical applications. Moreover, additional epigenetic features other than DNA methylation should be thoroughly compared, including proper re-establishment of poised enhancer patterns. As a more complete picture of the epigenomes of ESCs, iPSCs and other cell types emerges, important lessons regarding early developmental decisions in humans will be learnt, facilitating not only our understanding of human development, but also the establishment of robust in vitro differentiation protocols. These advancements will in turn allow for generation of replacement cells for cellular transplantation approaches and for development of the appropriate 'disease in a dish' models. Within such models, epigenomic profiling could be especially helpful in understanding the genetic basis of complex human disorders, where most of the causative variants are predicted to occur within the vast non-coding fraction of the human genome.
Acknowledgements
We thank members of the Wysocka laboratory for ideas and manuscript comments. We apologize to all those authors whose work was not cited because of space limitations. JW acknowledges grant CIRM RN1 00579-1.
References
- 20. Bernstein BE, Stamatoyannopoulos JA, Costello JF, Ren B, Milosavljevic A, Meissner A, Kellis M, Marra MA, Beaudet AL, Ecker JR, Farnham PJ, Hirst M, Lander ES, Mikkelsen TS, Thomson JA: The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol. 2010, 28: 1045-1048. 10.1038/nbt1010-1045.
- 21. ENCODE Project Consortium, Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447: 799-816. 10.1038/nature05874.
- 24. Frommer M, McDonald LE, Millar DS, Collis CM, Watt F, Grigg GW, Molloy PL, Paul CL: A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc Natl Acad Sci USA. 1992, 89: 1827-1831. 10.1073/pnas.89.5.1827.
- 33. Down TA, Rakyan VK, Turner DJ, Flicek P, Li H, Kulesha E, Gräf S, Johnson N, Herrero J, Tomazou EM, Thorne NP, Bäckdahl L, Herberth M, Howe KL, Jackson DK, Miretti MM, Marioni JC, Birney E, Hubbard TJ, Durbin R, Tavaré S, Beck S: A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis. Nat Biotechnol. 2008, 26: 779-785. 10.1038/nbt1414.
- 34. Fullwood MJ, Liu MH, Pan YF, Liu J, Xu H, Mohamed YB, Orlov YL, Velkov S, Ho A, Mei PH, Chew EG, Huang PY, Welboren WJ, Han Y, Ooi HS, Ariyaratne PN, Vega VB, Luo Y, Tan PY, Choy PY, Wansa KD, Zhao B, Lim KS, Leow SC, Yow JS, Joseph R, Li H, Desai KV, Thomsen JS, Lee YK, et al: An oestrogen-receptor-alpha-bound human chromatin interactome. Nature. 2009, 462: 58-64. 10.1038/nature08497.
- 36. Goldberg AD, Banaszynski LA, Noh KM, Lewis PW, Elsaesser SJ, Stadler S, Dewell S, Law M, Guo X, Li X, Wen D, Chapgier A, DeKelver RC, Miller JC, Lee YL, Boydston EA, Holmes MC, Gregory PD, Greally JM, Rafii S, Yang C, Scambler PJ, Garrick D, Gibbons RJ, Higgs DR, Cristea IM, Urnov FD, Zheng D, Allis CD: Distinct factors control histone variant H3.3 localization at specific genomic regions. Cell. 2010, 140: 678-691. 10.1016/j.cell.2010.01.003.
- 37. Harris RA, Wang T, Coarfa C, Nagarajan RP, Hong C, Downey SL, Johnson BE, Fouse SD, Delaney A, Zhao Y, Olshen A, Ballinger T, Zhou X, Forsberg KJ, Gu J, Echipare L, O'Geen H, Lister R, Pelizzola M, Xi Y, Epstein CB, Bernstein BE, Hawkins RD, Ren B, Chung WY, Gu H, Bock C, Gnirke A, Zhang MQ, Haussler D, et al: Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat Biotechnol. 2010, 28: 1097-1105. 10.1038/nbt.1682.
- 39. Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009, 326: 289-293. 10.1126/science.1181369.
- 42. Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, Lee W, Mendenhall E, O'Donovan A, Presser A, Russ C, Xie X, Meissner A, Wernig M, Jaenisch R, Nusbaum C, Lander ES, Bernstein BE: Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007, 448: 553-560. 10.1038/nature06008.
- 43. Schnetz MP, Handoko L, Akhtar-Zaidi B, Bartels CF, Pereira CF, Fisher AG, Adams DJ, Flicek P, Crawford GE, Laframboise T, Tesar P, Wei CL, Scacheri PC: CHD7 targets active gene enhancer elements to modulate ES cell-specific gene expression. PLoS Genet. 2010, 6: e1001023. 10.1371/journal.pgen.1001023.
- 54. Sha K, Boyer LA: The chromatin signature of pluripotent cells. StemBook. Edited by: Girard L. 2008, Massachusetts: Harvard Stem Cell Institute.
- 56. Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J, Fry B, Meissner A, Wernig M, Plath K, Jaenisch R, Wagschal A, Feil R, Schreiber SL, Lander ES: A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006, 125: 315-326. 10.1016/j.cell.2006.02.041.
- 58. Zhao XD, Han X, Chew JL, Liu J, Chiu KP, Choo A, Orlov YL, Sung WK, Shahab A, Kuznetsov VA, Bourque G, Oh S, Ruan Y, Ng HH, Wei CL: Whole-genome mapping of histone H3 Lys4 and 27 trimethylations reveals distinct genomic compartments in human embryonic stem cells. Cell Stem Cell. 2007, 1: 286-298. 10.1016/j.stem.2007.08.004.
- 63. Heintzman ND, Hon GC, Hawkins RD, Kheradpour P, Stark A, Harp LF, Ye Z, Lee LK, Stuart RK, Ching CW, Ching KA, Antosiewicz-Bourget JE, Liu H, Zhang X, Green RD, Lobanenkov VV, Stewart R, Thomson JA, Crawford GE, Kellis M, Ren B: Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature. 2009, 459: 108-112. 10.1038/nature07829.
- 64. Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, Wang W, Weng Z, Green RD, Crawford GE, Ren B: Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet. 2007, 39: 311-318. 10.1038/ng1966.
- 67. Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, Hanna J, Lodato MA, Frampton GM, Sharp PA, Boyer LA, Young RA, Jaenisch R: Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci USA. 2010, 107: 21931-21936. 10.1073/pnas.1016071107.
- 70. Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009, 462: 315-322. 10.1038/nature08514.
- 75. Koh KP, Yabuuchi A, Rao S, Huang Y, Cunniff K, Nardone J, Laiho A, Tahiliani M, Sommer CA, Mostoslavsky G, Lahesmaa R, Orkin SH, Rodig SJ, Daley GQ, Rao A: Tet1 and Tet2 regulate 5-hydroxymethylcytosine production and cell lineage specification in mouse embryonic stem cells. Cell Stem Cell. 2011, 8: 200-213. 10.1016/j.stem.2011.01.008.
- 77. Ficz G, Branco MR, Seisenberger S, Santos F, Krueger F, Hore TA, Marques CJ, Andrews S, Reik W: Dynamic regulation of 5-hydroxymethylcytosine in mouse ES cells and during differentiation. Nature. 2011, in press.
- 79. Hawkins RD, Hon GC, Lee LK, Ngo Q, Lister R, Pelizzola M, Edsall LE, Kuan S, Luu Y, Klugman S, Antosiewicz-Bourget J, Ye Z, Espinoza C, Agarwahl S, Shen L, Ruotti V, Wang W, Stewart R, Thomson JA, Ecker JR, Ren B: Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell. 2010, 6: 479-491. 10.1016/j.stem.2010.03.018.
- 82. Peric-Hupkes D, Meuleman W, Pagie L, Bruggeman SW, Solovei I, Brugman W, Gräf S, Flicek P, Kerkhoven RM, van Lohuizen M, Reinders M, Wessels L, van Steensel B: Molecular maps of the reorganization of genome-nuclear lamina interactions during differentiation. Mol Cell. 2010, 38: 603-613. 10.1016/j.molcel.2010.03.016.
- 90. Lister R, Pelizzola M, Kida YS, Hawkins RD, Nery JR, Hon G, Antosiewicz-Bourget J, O'Malley R, Castanon R, Klugman S, Downes M, Yu R, Stewart R, Ren B, Thomson JA, Evans RM, Ecker JR: Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature. 2011, 471: 68-73. 10.1038/nature09798.
- 91. Bock C, Kiskinis E, Verstappen G, Gu H, Boulting G, Smith ZD, Ziller M, Croft GF, Amoroso MW, Oakley DH, Gnirke A, Eggan K, Meissner A: Reference maps of human ES and iPS cell variation enable high-throughput characterization of pluripotent cell lines. Cell. 2011, 144: 439-452. 10.1016/j.cell.2010.12.032.
- 92. Chin MH, Mason MJ, Xie W, Volinia S, Singer M, Peterson C, Ambartsumyan G, Aimiuwu O, Richter L, Zhang J, Khvorostov I, Ott V, Grunstein M, Lavon N, Benvenisty N, Croce CM, Clark AT, Baxter T, Pyle AD, Teitell MA, Pelegrini M, Plath K, Lowry WE: Induced pluripotent stem cells and embryonic stem cells are distinguished by gene expression signatures. Cell Stem Cell. 2009, 5: 111-123. 10.1016/j.stem.2009.06.008.
- 94. Deng J, Shoemaker R, Xie B, Gore A, LeProust EM, Antosiewicz-Bourget J, Egli D, Maherali N, Park IH, Yu J, Daley GQ, Eggan K, Hochedlinger K, Thomson J, Wang W, Gao Y, Zhang K: Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming. Nat Biotechnol. 2009, 27: 353-360. 10.1038/nbt.1530.
- 97. McDaniell R, Lee BK, Song L, Liu Z, Boyle AP, Erdos MR, Scott LJ, Morken MA, Kucera KS, Battenhouse A, Keefe D, Collins FS, Willard HF, Lieb JD, Furey TS, Crawford GE, Iyer VR, Birney E: Heritable individual-specific and allele-specific chromatin signatures in humans. Science. 2010, 328: 235-239. 10.1126/science.1184655.
- 98. Kim K, Doi A, Wen B, Ng K, Zhao R, Cahan P, Kim J, Aryee MJ, Ji H, Ehrlich LI, Yabuuchi A, Takeuchi A, Cunniff KC, Hongguang H, McKinney-Freeman S, Naveiras O, Yoon TJ, Irizarry RA, Jung N, Seita J, Hanna J, Murakami P, Jaenisch R, Weissleder R, Orkin SH, Weissman IL, Feinberg AP, Daley GQ: Epigenetic memory in induced pluripotent stem cells. Nature. 2010, 467: 285-290. 10.1038/nature09342.
- 99. Polo JM, Liu S, Figueroa ME, Kulalert W, Eminli S, Tan KY, Apostolou E, Stadtfeld M, Li Y, Shioda T, Natesan S, Wagers AJ, Melnick A, Evans T, Hochedlinger K: Cell type of origin influences the molecular and functional properties of mouse induced pluripotent stem cells. Nat Biotechnol. 2010, 28: 848-855. 10.1038/nbt.1667.
- 100. Doi A, Park IH, Wen B, Murakami P, Aryee MJ, Irizarry R, Herb B, Ladd-Acosta C, Rho J, Loewer S, Miller J, Schlaeger T, Daley GQ, Feinberg AP: Differential methylation of tissue- and cancer-specific CpG island shores distinguishes human induced pluripotent stem cells, embryonic stem cells and fibroblasts. Nat Genet. 2009, 41: 1350-1353. 10.1038/ng.471.
- 104. Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey BW, Cassady JP, Cabili MN, Jaenisch R, Mikkelsen TS, Jacks T, Hacohen N, Bernstein BE, Kellis M, Regev A, Rinn JL, Lander ES: Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009, 458: 223-227. 10.1038/nature07672.
- 106. Egelhofer TA, Minoda A, Klugman S, Lee K, Kolasinska-Zwierz P, Alekseyenko AA, Cheung MS, Day DS, Gadel S, Gorchakov AA, Gu T, Kharchenko PV, Kuan S, Latorre I, Linder-Basso D, Luu Y, Ngo Q, Perry M, Rechtsteiner A, Riddle NC, Schwartz YB, Shanower GA, Vielle A, Ahringer J, Elgin SC, Kuroda MI, Pirrotta V, Ren B, Strome S, Park PJ, et al: An assessment of histone-modification antibody quality. Nat Struct Mol Biol. 2011, 18: 91-93. 10.1038/nsmb.1972.
- 112. Barrett JC, Hansoul S, Nicolae DL, Cho JH, Duerr RH, Rioux JD, Brant SR, Silverberg MS, Taylor KD, Barmada MM, Bitton A, Dassopoulos T, Datta LW, Green T, Griffiths AM, Kistner EO, Murtha MT, Regueiro MD, Rotter JI, Schumm LP, Steinhart AH, Targan SR, Xavier RJ, NIDDK IBD Genetics Consortium, Libioulle C, Sandor C, Lathrop M, Belaiche J, Dewit O, Gut I: Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease. Nat Genet. 2008, 40: 955-962. 10.1038/ng.175.
- 113. Mangold E, Ludwig KU, Birnbaum S, Baluardo C, Ferrian M, Herms S, Reutter H, de Assis NA, Chawa TA, Mattheisen M, Steffens M, Barth S, Kluck N, Paul A, Becker J, Lauster C, Schmidt G, Braumann B, Scheer M, Reich RH, Hemprich A, Pötzsch S, Blaumeiser B, Moebus S, Krawczak M, Schreiber S, Meitinger T, Wichmann HE, Steegers-Theunissen RP, Kramer FJ: Genome-wide association study identifies two susceptibility loci for nonsyndromic cleft lip with or without cleft palate. Nat Genet. 2010, 42: 24-26. 10.1038/ng.506.
- 114. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes of BioMedical Research, Saxena R, Voight BF, Lyssenko V, Burtt NP, de Bakker PI, Chen H, Roix JJ, Kathiresan S, Hirschhorn JN, Daly MJ, Hughes TE, Groop L, Altshuler D, Almgren P, Florez JC, Meyer J, Ardlie K, Bengtsson Boström K, Isomaa B, Lettre G, Lindblad U, Lyon HN, Melander O, Newton-Cheh C, Nilsson P, Orho-Melander M, Råstam L, Speliotes EK, Taskinen MR: Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 2007, 316: 1331-1336.
- 115. Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL, Erdos MR, Stringham HM, Chines PS, Jackson AU, Prokunina-Olsson L, Ding CJ, Swift AJ, Narisu N, Hu T, Pruim R, Xiao R, Li XY, Conneely KN, Riebow NL, Sprau AG, Tong M, White PP, Hetrick KN, Barnhart MW, Bark CW, Goldstein JL, Watkins L, Xiang F, Saramies J: A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science. 2007, 316: 1341-1345. 10.1126/science.1142382.
- 116. Scuteri A, Sanna S, Chen WM, Uda M, Albai G, Strait J, Najjar S, Nagaraja R, Orrú M, Usala G, Dei M, Lai S, Maschio A, Busonero F, Mulas A, Ehret GB, Fink AA, Weder AB, Cooper RS, Galan P, Chakravarti A, Schlessinger D, Cao A, Lakatta E, Abecasis GR: Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet. 2007, 3: e115. 10.1371/journal.pgen.0030115.
- 117. Thomas G, Jacobs KB, Yeager M, Kraft P, Wacholder S, Orr N, Yu K, Chatterjee N, Welch R, Hutchinson A, Crenshaw A, Cancel-Tassin G, Staats BJ, Wang Z, Gonzalez-Bosquet J, Fang J, Deng X, Berndt SI, Calle EE, Feigelson HS, Thun MJ, Rodriguez C, Albanes D, Virtamo J, Weinstein S, Schumacher FR, Giovannucci E, Willett WC, Cussenot O, Valeri A, et al: Multiple loci identified in a genome-wide association study of prostate cancer. Nat Genet. 2008, 40: 310-315. 10.1038/ng.91.
- 118. Thomas G, Jacobs KB, Yeager M, Kraft P, Wacholder S, Orr N, Yu K, Chatterjee N, Welch R, Hutchinson A, Crenshaw A, Cancel-Tassin G, Staats BJ, Wang Z, Gonzalez-Bosquet J, Fang J, Deng X, Berndt SI, Calle EE, Feigelson HS, Thun MJ, Rodriguez C, Albanes D, Virtamo J, Weinstein S, Schumacher FR, Giovannucci E, Willett WC, Cussenot O, Valeri A: Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24. Nat Genet. 2007, 39: 989-994. 10.1038/ng2089.
- 124. Kasowski M, Grubert F, Heffelfinger C, Hariharan M, Asabere A, Waszak SM, Habegger L, Rozowsky J, Shi M, Urban AE, Hong MY, Karczewski KJ, Huber W, Weissman SM, Gerstein MB, Korbel JO, Snyder M: Variation in transcription factor binding among humans. Science. 2010, 328: 232-235. 10.1126/science.1183621.