The genetics of acute myeloid leukemia

Acute myeloid leukemia (AML) is an aggressive malignancy of the bone marrow with a 5-year overall survival of approximately 30% [1, 2]. Recent advances in next-generation sequencing have allowed us to better understand the genetics of AML and led to the identification of many recurrently mutated genes implicated in the pathogenesis of this disease [3]. However, despite these powerful tools, our fundamental understanding of leukemogenesis remains incomplete. In particular, little is known about the founding mutations in AML. In previous work, our laboratory has proposed a model by which AML develops through the serial acquisition of mutations in long-lived, self-renewing hematopoietic stem cells (HSCs). Through the prospective isolation of rare residual HSCs, we have shown that these cells harbor “pre-leukemic” founder mutations, but lack the complete set of mutations required to generate AML [4, 5].

Studying these cells, which we term “pre-leukemic HSCs”, has provided insights into the early progression of AML and suggests that pre-leukemic HSCs may be an important therapeutic target to ensure more durable clinical remissions. In particular, we have identified patterns of mutational acquisition in AML where the earliest mutations occur in genes referred to as “landscaping genes”, involved primarily in regulation of the epigenome, while late mutations occur primarily in “proliferative” genes, involved in activated signaling. More specifically, mutations in genes involved in processes such as DNA methylation, histone modification, and chromatin accessibility and conformation were highly enriched in pre-leukemic cells. In contrast, mutations in NPM1 or in genes involved in activated signaling, such as FLT3 or KRAS, were largely absent from pre-leukemic cells. The acquisition of these late mutations also resulted in, or was accompanied by, a loss of “stemness”, a specification down the myeloid lineage, and a change in immunophenotype. Importantly, as the majority of self-renewing leukemic cells in most patients have a progenitor phenotype, the late-occurring mutations that lead to final leukemic transformation may occur in progenitor cells. These observations support a model for leukemogenesis in which mutations in landscaping genes occur in HSCs early in evolutionary time, preparing these pre-leukemic HSCs or downstream progenitors to transform upon the acquisition of additional proliferative mutations [5].

Recently, several large cohort analyses have further explored this question of founding mutations in AML and other hematologic malignancies by examining peripheral blood exome sequences from non-leukemia patient cohorts [6–9]. These studies identified evidence of clonal hematopoiesis in elderly individuals in association with mutations in landscaping genes commonly mutated in AML including DNMT3A, TET2, ASXL1, and the cohesin complex. Importantly, individuals with such mutated clonal hematopoiesis had an increased risk for subsequent development of a hematologic malignancy.

Phenotypic consequences of cohesin complex mutations in HSPCs

Recurrent mutations in the cohesin complex, composed of four core components STAG1/2, RAD21, SMC1A, and SMC3, have been identified in myelodysplastic syndrome (MDS), AML, and other myeloid malignancies through large-scale re-sequencing efforts [10]. Cohesin mutations are almost always mutually exclusive, suggesting that an alteration in one component may be sufficient to affect the entire complex, and collectively occur in approximately 15% of AML cases and other myeloid malignancies [3, 11–14]. Furthermore, work on pre-leukemic HSCs, described above, has shown that cohesin mutations often occur early in leukemogenesis [5], and complementary studies have shown that cohesin mutations are observed in MDS that transforms into secondary AML, indicating that these mutations are stable and likely important to the pathogenesis of the disease [15].

The cohesin complex functions to hold chromatin strands within a ring-like structure composed of the four core components [16]. Although its canonical and best-established role is to maintain sister chromatid cohesion during mitosis, cohesin is also involved in double-stranded DNA damage repair and regulation of transcription [17]. In particular, cohesin can regulate transcription factor recruitment to chromatin, and during mitosis, retention of cohesin at transcription factor cluster sites contributes to transcriptional memory in proliferating cells [18]. This function is often referred to as mitotic bookmarking, which can also be mediated by tissue-specific mitotic bookmarks such as the GATA family of hematopoietic transcription factors or other pioneer transcription factors [19–21]. Additional functions of cohesin involve its interaction with the transcriptional repressor CTCF to mediate chromatin topology [22]. The discovery of recurrent mutations in the cohesin complex indicates another pathway that is relevant for AML, yet the manner in which they contribute to pathogenesis has only recently been interrogated.

Interestingly, cohesin complex mutations have previously been identified in multiple cancer types, including glioblastoma multiforme [23], Ewing sarcoma [24], colorectal carcinoma [25], and bladder carcinoma [26, 27]. Reports conflict [25, 27] on whether cohesin mutations are associated with aneuploidy in bladder cancer, an association that would implicate defects in cohesin's canonical role in maintaining sister chromatid cohesion during mitosis as a mechanism for malignant transformation. Notably, in most AML cases, cohesin mutations are not associated with karyotypic abnormalities [3, 12], suggesting that defects in chromatid cohesion do not contribute to leukemogenesis.

A set of four recent studies sought to elucidate the phenotypic consequences of cohesin mutations and loss of cohesin on leukemogenesis in human and mouse models (Table 1) [28–31]. Results from our lab showed that introduction of mutant cohesin into AML cell lines and primary human cord blood hematopoietic stem and progenitor cells (HSPCs) resulted in a differentiation block with an increased frequency of CD34+ cord blood progenitor cells. A similar phenotype was observed with knockdown of RAD21 both in vitro and in vivo, indicating that mutant cohesin can act either through haploinsufficiency or dominant-negative mechanisms. Mutant cohesin increased the serial replating ability of human HSPCs in vitro and showed enrichment of HSC and leukemia stem cell gene expression signatures, indicating that it enforces stem cell functions. Interestingly, the effect of mutant cohesin was found to be cell context dependent, in that these phenotypes could only be observed in the most immature HSPC populations [28].

Table 1 Summary of cohesin mutation/knockdown studies in hematopoiesis

These results were consistent with an RNAi screen conducted by Galeev et al. targeted to primary human HSPCs to identify critical modifiers of self-renewal and differentiation. This screen identified several members of the cohesin complex (RAD21, SMC3, and STAG1/2), and showed that cohesin deficiency induces HSC-specific gene programs. Furthermore, cohesin-deficient HSPCs showed increased reconstitution potential in primary and secondary transplantation studies [29].

Complementary work in mouse models showed that depletion of cohesin subunits increased in vitro replating efficiency and led to myeloid-skewed differentiation, consistent with the phenotype seen in cohesin mutant human HSPCs [30, 31]. In particular, the Levine lab showed a dose-dependent role for SMC3 in regulating chromatin structure and HSPC function. Biallelic loss of SMC3 in mice induced bone marrow aplasia with premature sister chromatid separation and revealed an absolute requirement for cohesin in HSPC function, whereas SMC3 haploinsufficiency increased self-renewal in vitro and in vivo. Furthermore, analogous to results seen in human cells, SMC3 haploinsufficiency reduced expression of transcription factors and other genes associated with hematopoietic lineage commitment. Finally, combination of SMC3 knockdown with FLT3-ITD resulted in AML development in vivo [30].

A similar approach was taken by the Aifantis lab, which generated a series of inducible cohesin-shRNA mouse models targeting each of the four subunits. Knockdown of cohesin complex members led to gain of replating capacity of mouse HSPCs. Furthermore, aged cohesin knockdown mice developed a clinical picture closely resembling myeloproliferative neoplasms (MPNs), indicating that cohesin mutations can occur as an early event in leukemogenesis [31], as seen in human cells.

Mechanistic consequences of cohesin complex mutations in AML

Recently published work on cohesin mutant and loss-of-cohesin models implicates a link between the role of cohesin in the regulation of transcription through the modulation of chromatin accessibility and the maintenance of stem cell functions in HSPCs [28, 30, 31]. Consistent with a previous report [18], using an assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq), our lab observed a global loss of open chromatin in cohesin mutant-expressing cells, but an increase in accessibility at specific motifs for key hematopoietic transcription factors, including ERG, GATA2, and RUNX1, that in turn execute stem cell transcriptional programs and phenotypes. We further showed that knockdown of GATA2 or RUNX1 in human HSPCs reverted the differentiation block induced by mutant cohesin [28]. These results support a model in which mutant cohesin impairs hematopoietic differentiation and enforces stem cell programs through the modulation of GATA2 and RUNX1 chromatin accessibility, expression, and activity. This was consistent with results seen in cohesin-deficient mouse models, where widespread alterations in chromatin accessibility were apparent [30, 31]. However, further studies will be needed to better define the loci affected by cohesin mutants or deficiency.

Several mechanisms could explain these results. One model proposes that the global loss of open chromatin observed in cohesin mutant-expressing cells enriches, either directly or indirectly, the activity of pioneer factors that can bind chromatin in the absence of the cohesin complex (Fig. 1). Previous papers have suggested that GATA2 [32], RUNX1 [33], and ERG [34] can all function as pioneer factors in hematopoietic cells. This results in enrichment of open motifs for these factors and higher likelihood of occupancy in cohesin mutant-expressing cells, as detected in our studies by ATAC-seq and ChIP-seq, with a decrease in accessibility for motifs of non-pioneer transcription factors that promote hematopoietic differentiation. The result is impaired hematopoietic differentiation and enforcement of stem cell programs in HSPCs.

Fig. 1 Potential mechanisms mediating the cohesin mutant phenotype. Cohesin has been shown to act as a mitotic bookmark during the cell cycle, remaining bound to chromosomes and directing transcription factors back to their sites after mitosis. Loss of cohesin may enrich for activity of pioneer factors such as ERG, GATA2, and RUNX1, or tissue-specific mitotic bookmarks that remain bound through the cell cycle, and drive stem cell programs

A second, related, model derives from the interplay between cohesin-dependent and tissue-specific mitotic bookmarking (Fig. 1) [21]. Mitotic bookmarking is the process of marking active regulatory elements by some mechanism during S and M phases of the cell cycle when the vast majority of DNA-binding proteins are disengaged from DNA. Cohesin has been shown to be an important mitotic bookmark for several transcription factors. In cohesin mutant-expressing cells, mitotic bookmarking could be skewed towards tissue-specific processes, which in hematopoietic cells have been shown to involve GATA transcription factors [19]. Thus, cohesin mutant-expressing HSPCs in the process of establishing transcription factor complexes that promote differentiation lose these complexes when they undergo mitosis. This enriches tissue-specific mitotic bookmarks, such as GATA factors, that in turn drive HSPC transcriptional programs. Future studies using techniques developed for low cell numbers will be needed to determine if cohesin is indeed lost at sites of hematopoietic differentiation in human HSPCs.

A third possible mechanism is based on cohesin’s role in maintaining chromatin architecture along with CTCF. Cohesin ChIA-PET studies in mouse ES cells have shown that many cell identity genes, including GATA2, are found within chromosome structures that are formed by the looping of two interacting CTCF sites co-occupied by cohesin [35]. These looped structures form insulated neighborhoods whose integrity is important for proper expression of local genes, and it has been suggested that loss of cohesin would lead to de-repression of these genes (Fig. 2).

Fig. 2 Potential effect of cohesin mutants on chromatin architecture. Wild-type cohesin is known to work with CTCF to insulate genes in neighborhoods known as topologically associated domains (TADs) through chromatin looping. Loss of cohesin at these TADs may lead to de-repression of HSPC genes such as GATA2

Although a direct link between mutations in the cohesin complex, chromatin landscape changes and AML leukemogenesis remains to be established, these recent studies provide the first mechanistic insights into how cohesin mutations impact normal HSPC function [28, 30, 31]. Although the chromatin topology changes in cohesin mutant cells may not be leukemogenic in stem cells, the inability to shut down stem cell programs during differentiation may drive transformation with a secondary leukemogenic event, as seen in the SMC3 knockdown-FLT3-ITD mouse model. These results are consistent with a model of mutational acquisition in AML that we have previously proposed [5], in which pre-leukemic mutations occur in genes involved in global regulation of gene expression through epigenetic mechanisms that impair differentiation and/or affect self-renewal (such as IDH1/2, TET2, DNMT3A, and cohesin), whereas late mutations occur in genes that generally lead to an increase in activated signaling and proliferation (such as FLT3 and RAS).

In conclusion, cohesin mutants impair hematopoietic differentiation and enforce stem cell programs in human and mouse models of hematopoiesis, and occur as pre-leukemic and early mutations in human AML and myeloid malignancies (Fig. 3). These observations indicate that cohesin mutants are excellent targets for novel therapeutics with the potential to change disease outcomes. Understanding the mechanisms by which these mutants act will be critical to the identification and design of such novel targeted agents. Furthermore, as cohesin mutations are found in a variety of human cancers, it will be important to determine if cohesin complex mutations impart the same phenotype observed in hematopoietic cells in these other tissues and cellular contexts.

Fig. 3 Schematic of cohesin mutants in leukemogenesis. Mutations in subunits of the cohesin complex occur early in leukemogenesis and lead to increases in chromatin accessibility, notably at sites of HSPC transcription factor binding, which in turn leads to enforcement of stem cell programs. Late-acquired proliferative mutations, such as FLT3-ITD, cooperate with cohesin mutations to cause fulminant leukemia