Integrated epigenomic analysis stratifies chromatin remodellers into distinct functional groups
ATP-dependent chromatin remodelling complexes are responsible for establishing and maintaining the positions of nucleosomes. Chromatin remodellers are targeted to chromatin by transcription factors and non-coding RNA to remodel the chromatin into functional states. However, the influence of chromatin remodelling on shaping the functional epigenome is not well understood. Moreover, chromatin remodellers have not been extensively explored as a collective group across two-dimensional and three-dimensional epigenomic layers.
Here, we have integrated the genome-wide binding profiles of eight chromatin remodellers together with DNA methylation, nucleosome positioning, histone modification and Hi-C chromosomal contacts to reveal that chromatin remodellers can be stratified into two functional groups. Group 1 (BRG1, SNF2H, CHD3 and CHD4) has a clear preference for binding at ‘actively marked’ chromatin and Group 2 (BRM, INO80, SNF2L and CHD1) for ‘repressively marked’ chromatin. We find that histone modifications and chromatin architectural features, but not DNA methylation, stratify the remodellers into these functional groups.
Our findings suggest that chromatin remodelling events are synchronous and that chromatin remodellers themselves should be considered simultaneously and not as individual entities in isolation or necessarily by structural similarity, as they are traditionally classified. Their coordinated function should be considered by preference for chromatin features in order to gain a more accurate and comprehensive picture of chromatin regulation.
Keywords: Chromatin, Nucleosome, Chromatin remodelling, Enhancer, Promoter, Gene regulation, Epigenetics, CHD, SWI/SNF, INO80, ISWI
Abbreviations
BRG1: Brahma-related gene 1
CHD: chromodomain helicase DNA binding
chromHMM: chromatin multivariate hidden Markov model
H3K27ac: histone 3 lysine 27 acetylation
H3K27me3: histone 3 lysine 27 trimethylation
H3K4me1: histone 3 lysine 4 monomethylation
H3K4me3: histone 3 lysine 4 trimethylation
H3K9me3: histone 3 lysine 9 trimethylation
Hi-C: high-throughput chromosome conformation capture
INO80: inositol requiring 80 complex
SNF2: sucrose non-fermenting 2
Chromatin is a dynamic and multi-layered structure of which the core building block is the nucleosome. Nucleosomes comprise an octamer of histone proteins and 147 base pairs (bp) of DNA in approximately two helical turns. The unique chromatin conformation of any given cell is typically maintained throughout divisions and serves as a physical barrier to transcription factors and other regulatory proteins in order to prevent promiscuous gene expression [2, 3]. Thus, chromatin structure must be modulated for regulatory factors to access DNA when required. This is largely achieved through the movement of nucleosomes by ATP-dependent chromatin remodelling complexes, which utilise ATP hydrolysis to organise nucleosomes into an active (relaxed) or repressive (compact) conformation. The process of chromatin remodelling provides a means of regulating DNA structure with precision and accuracy to facilitate diverse cellular processes including transcriptional regulation, DNA repair, DNA replication and cell cycle progression [2, 4]. However, despite growing research into the molecular and biochemical mechanisms of chromatin remodelling, it is still not completely understood how remodelling complexes work together to position nucleosomes for the required chromatin function.
Beyond the physical nature of its structure, chromatin carries diverse gene regulatory information including post-translational modifications of histone proteins and DNA methylation [2, 4, 5]. These features, together with non-coding RNA (ncRNA) species, form the epigenome. Chromatin remodellers are drawn to their target regions by sequence-specific regulatory proteins or ncRNAs [6, 7, 8] and use their protein structural domains to identify epigenetic patterning and the ‘linker’ DNA between nucleosomes to identify their preferred nucleosome substrate [3, 9, 10, 11]. Yet, it is important to consider that many regulatory regions of the genome are a composite of multiple epigenetic marks; therefore, there is a multifaceted relationship between chromatin remodellers and the epigenome. For example, bivalent promoters are characterised by the trimethylation of histone 3 at both lysine 4 (H3K4me3) and lysine 27 (H3K27me3) [12, 13, 14] and could therefore potentially be ‘read’ by remodellers recognising either of these marks. Additionally, these relationships are not linear as more than one remodeller may recognise a single histone modification . Uncovering the extent of overlapping and unique activity of chromatin remodeller proteins is of great interest and is essential for understanding the influence of the epigenetic signature on chromatin remodelling events at any given locus.
Here, we have examined the binding profiles of eight different chromatin remodeller proteins and integrated these with extensive epigenomic data, including histone modifications, DNA methylation and chromosome architectural information. Our study reveals that chromatin remodellers can be stratified into two groups based on their binding enrichment at either ‘actively marked’ or ‘repressively marked’ regions and their interactions with chromatin features of the epigenome.
The degree of chromatin remodeller binding correlates with the remodeller gene expression levels in prostate cancer cells
To improve our understanding of the relationship between different chromatin remodellers, we sought to examine the genomic binding profiles of multiple chromatin remodeller proteins simultaneously from publicly available data obtained from LNCaP prostate cancer cells. All eight remodellers examined are catalytic ATPases that form mutually exclusive complexes and together represent all four of the structural families, unlike previous studies in mouse embryonic stem cells. The remodellers studied were BRG1 and BRM from the SWI/SNF family; SNF2H and SNF2L from the ISWI family; INO80 from the INO80-like family; and CHD1, CHD3 and CHD4 from the CHD family (Fig. 1a). We first assessed the gene expression level of each remodeller using published RNA-seq data from LNCaP cells and compared this to prostate tumour samples from The Cancer Genome Atlas (TCGA) database to determine the validity of using LNCaP cells as a prostate cancer model to explore chromatin remodellers. We found a high concordance in the gene expression pattern for each remodeller between these data sets, where those genes displaying higher expression in clinical samples from TCGA (n = 486) also had higher expression in the LNCaP cell line (Additional file 1: Figure S1A–B). A similar comparison was made between a normal prostate epithelial cell line (PrEC) and the normal samples from TCGA (n = 52); the expression patterns of the remodellers in PrEC cells mirrored those of the normal clinical samples (Additional file 1: Figure S1C–D). We calculated the Pearson correlation coefficient and found a strong positive linear relationship between the LNCaP cells and the TCGA tumours (Pearson R = 0.8375552) and between the PrEC cells and the TCGA normal data sets (Pearson R = 0.670013; Additional file 1: Figure S1E). The exception to this was SNF2L, which was significantly lower in LNCaP cells compared to the TCGA tumours.
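The correlation step described above can be illustrated with a minimal Python sketch. This is not the code used in the study (the correlations were calculated in R); the function name `pearson_r`, the toy expression values and the choice of natural logarithm are all assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical expression values for illustration only (not data from the study).
cell_line_tpm = [120.0, 45.0, 80.0, 10.0]
tumour_rpkm = [95.0, 30.0, 70.0, 12.0]

# Compare log-transformed values; the log base is an assumption.
r = pearson_r([math.log(v) for v in cell_line_tpm],
              [math.log(v) for v in tumour_rpkm])
print(f"Pearson R = {r:.3f}")
```

Each list position stands for one remodeller gene, so the two lists must be in the same gene order.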
Using the ChIP-seq binding profiles of the eight chromatin remodellers from Ye et al. , we next examined the number of individual binding sites for each remodeller and established they ranged from 712 for INO80 to 39,887 for SNF2H (Fig. 1b), and in total, there were 60,043 genomic regions bound by at least one chromatin remodeller. Upon visualising the remodeller binding sites, we observed several regions where multiple remodellers were bound, such as the intergenic region between DUSP22 and IRF4 on chromosome 6 and upstream of the EBLN2 gene on chromosome 3 (Fig. 1c). Additionally, there were several sites where a single remodeller was bound, such as CHD3 at the EBLN2 gene promoter and SNF2H within the DAB2IP gene (Fig. 1c). We observed that the majority of the binding sites occurred over regions marked by gene regulatory histone modifications. Over 75% of the remodeller binding sites were under 750 bp in size, corresponding to a span of ~ 1–5 nucleosomes (Additional file 1: Figure S2A–H), covering 0.93% of the genome. The degree of overlap between the remodellers varied extensively with the number of unique binding sites for each remodeller ranging from ~ 25 to ~ 50%, with the exception of SNF2H which had ~ 75% unique sites, likely paralleling the overall large number of binding sites for this remodeller (Fig. 1d).
Together, these data show that there is large variation in the number of genomic sites occupied by chromatin remodellers, with both independent and a high degree of overlapping activity between the remodeller proteins. Importantly, the high overlap in the number of binding sites containing at least two remodellers suggests there is widespread coordinated activity across the genome.
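One way to picture how the union of bound regions and the fraction of unique sites per remodeller could be computed is sketched below. This is an illustrative Python sketch under stated assumptions, not the pipeline used in the study: intervals are half-open `(chrom, start, end)` tuples, any overlap of at least 1 bp counts as shared, and the function names and toy peaks are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) half-open intervals into a union set."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def unique_fraction(name, peaks):
    """Fraction of one remodeller's peaks that overlap no peak of any other remodeller."""
    others = [p for other, ps in peaks.items() if other != name for p in ps]
    own = peaks[name]
    unique = [p for p in own if not any(overlaps(p, q) for q in others)]
    return len(unique) / len(own)


# Toy peaks for two hypothetical remodellers (not data from the study).
toy_peaks = {
    "remodeller_A": [("chr1", 100, 200), ("chr1", 500, 600)],
    "remodeller_B": [("chr1", 150, 250)],
}
all_peaks = [p for ps in toy_peaks.values() for p in ps]
print(merge_intervals(all_peaks))
print(unique_fraction("remodeller_A", toy_peaks))
```

The pairwise comparison is quadratic in the number of peaks, which is acceptable for a sketch; a sorted sweep or an interval tree would be preferable at genome scale.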
The epigenome stratifies chromatin remodellers into two groups: Group 1 is associated with ‘actively marked’ chromatin and Group 2 with ‘repressively marked’ chromatin
As our data show that the majority of chromatin remodeller binding occurs at histone ‘marked’ chromatin, we next examined the enrichment of remodeller binding across gene regulatory features. We specifically targeted ‘actively marked’ and ‘repressively marked’ regulatory elements that are ‘marked’ with a composite of histone modifications. To define these regulatory chromatin features, we used annotated transcription start sites (TSS) from GENCODE and classified the promoter as a 2 kb region surrounding the TSS. We then separated these promoters according to the histone modifications present in LNCaP cells to classify them as active (H3K4me3 and H3K27ac), facultative repressed (H3K27me3), constitutively repressed (H3K9me3) and bivalent (H3K4me3 and H3K27me3) (Fig. 2b). Additionally, we defined putative enhancers as being at least 2 kb away from an annotated TSS and classified them as either active (H3K4me1, H3K27ac, p300 and DNaseI accessible; Additional file 1: Figure S3C) or poised (H3K4me1) (Fig. 2b), and analysed the chromatin remodellers at these regions.
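The promoter definitions above can be expressed as a small rule set. The following Python sketch is illustrative only: the symmetric window of 1 kb either side of the TSS, the order in which the rules are tested (bivalent before active) and the `unmarked` fallback are assumptions, since the text does not state how promoters carrying several marks were resolved.

```python
def promoter_window(tss, flank=1000):
    """Return a 2 kb window around a TSS (assumes 1 kb on each side, clipped at 0)."""
    return max(0, tss - flank), tss + flank


def classify_promoter(marks):
    """Classify a promoter from the set of histone marks overlapping it."""
    marks = set(marks)
    if {"H3K4me3", "H3K27me3"} <= marks:
        return "bivalent"
    if {"H3K4me3", "H3K27ac"} <= marks:
        return "active"
    if "H3K27me3" in marks:
        return "facultative repressed"
    if "H3K9me3" in marks:
        return "constitutively repressed"
    return "unmarked"


print(promoter_window(5000))
print(classify_promoter(["H3K4me3", "H3K27ac"]))
print(classify_promoter(["H3K4me3", "H3K27me3"]))
```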
We assessed whether each chromatin remodeller was bound at the gene regulatory features defined above, more (enriched) or less (depleted) than expected by chance using the Genome Association Tester (GAT). All chromatin remodeller proteins were significantly enriched at ‘actively marked’ regulatory elements (Fig. 2c–e; Additional file 1: Figure S4A). However, it was notable that they could be stratified into two distinct groups based on the level of enrichment at these regions. BRG1, SNF2H, CHD3 and CHD4, herein called Group 1, were significantly more enriched compared to BRM, INO80, SNF2L and CHD1, herein called Group 2 (Student’s unpaired T-test, active promoters p = 0.04; active enhancers p = 0.0001; poised enhancers p = 0.0004). At gene regulatory regions with ‘repressively marked’ epigenetic features (the facultative and constitutively repressed promoters), the majority of Group 1 remodellers were not significantly enriched, in contrast to Group 2 (Fig. 2f, g). Concomitantly, there was a significant difference between the mean enrichment scores of Group 1 and Group 2 remodellers at repressed regions (Student’s unpaired T-test, facultative repressed promoters p = 0.016; constitutively repressed promoters p = 0.02). However, statistical enrichment at a particular genomic annotated feature does not determine the extent of direct overlap of remodeller binding sites at these features. By examining the direct overlap within each group of remodellers, we found that there are common and unique binding sites (Additional file 1: Figure S4B–C); for Group 1 binding sites, only 22.6% contained two or more Group 1 remodellers (Additional file 1: Figure S4D), and for Group 2 binding sites, only 17.3% contained more than one Group 2 remodeller (Additional file 1: Figure S4E). Example regions bound by all Group 1 remodellers are illustrated upstream of the SDC1 promoter and at the COPS4 promoter, and adjacent to exon 2 of the COPS4 gene for Group 2 (Fig. 2h).
Therefore, while each group of remodellers was enriched at the same genomic features (i.e. binding occurs higher than expected by chance), there are only a small percentage of regions where all the remodellers bind together, highlighting varied roles for these remodellers.
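The idea of testing whether binding sites overlap a feature more than expected by chance can be sketched with a simple permutation scheme. This is not the GAT implementation; it is a simplified Python illustration that assumes a single chromosome, uniform random placement of segments, and no handling of gaps, isochores or overlaps between the relocated segments. All names and values are hypothetical.

```python
import random


def overlap_bp(segments, annotations):
    """Total bases shared between two lists of (start, end) half-open intervals."""
    total = 0
    for s_start, s_end in segments:
        for a_start, a_end in annotations:
            total += max(0, min(s_end, a_end) - max(s_start, a_start))
    return total


def fold_enrichment(segments, annotations, genome_length, n_sim=1000, seed=0):
    """Return (observed / mean simulated overlap, one-sided empirical p-value)."""
    rng = random.Random(seed)
    observed = overlap_bp(segments, annotations)
    simulated = []
    for _ in range(n_sim):
        shuffled = []
        for start, end in segments:
            length = end - start
            # Relocate each segment uniformly at random, keeping its length.
            new_start = rng.randint(0, genome_length - length)
            shuffled.append((new_start, new_start + length))
        simulated.append(overlap_bp(shuffled, annotations))
    expected = sum(simulated) / n_sim
    p_value = (1 + sum(s >= observed for s in simulated)) / (1 + n_sim)
    fold = observed / expected if expected > 0 else float("inf")
    return fold, p_value


# Toy example: two binding sites that both fall inside one annotated feature.
fold, p_value = fold_enrichment([(100, 200), (300, 400)], [(0, 1000)], 100_000)
print(f"fold enrichment = {fold:.1f}, empirical p = {p_value:.4f}")
```

A fold value above 1 with a small empirical p-value corresponds to 'enriched' in the sense used above, and a fold value below 1 to 'depleted' (which would need the opposite tail of the simulated distribution).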
We next examined bivalent chromatin, which exhibits both active and repressive epigenetic features. Given the above results, we hypothesised that both Group 1 and Group 2 remodellers would be enriched equally at bivalent chromatin. We indeed found that all remodellers were significantly enriched at bivalent promoters; notably, however, Group 2 remodellers had significantly higher enrichment compared to Group 1 (Fig. 2i; Student’s unpaired T-test, p = 0.0004). This is in line with previous research showing low enrichment of BRG1 and CHD4 at bivalent chromatin.
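The group comparisons reported above rely on Student's unpaired T-test on the enrichment scores of the four remodellers in each group. A minimal Python sketch of the pooled-variance test statistic follows; it returns only the t statistic and degrees of freedom (a p-value would additionally need the t-distribution, which is outside the standard library), and the scores shown are hypothetical, not values from the study.

```python
import math


def unpaired_t_statistic(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")
    mean_a = sum(a) / n_a
    mean_b = sum(b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("t statistic is undefined when both groups have zero variance")
    return (mean_a - mean_b) / se, df


# Hypothetical enrichment scores for four remodellers per group.
group1_scores = [1.0, 2.0, 3.0, 4.0]
group2_scores = [5.0, 6.0, 7.0, 8.0]
t_stat, dof = unpaired_t_statistic(group1_scores, group2_scores)
print(f"t = {t_stat:.3f}, df = {dof}")
```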
We next performed the equivalent enrichment analysis using the chromHMM 15 state model (see “Methods”). Again, Group 1 remodellers had a mean enrichment score significantly higher than Group 2 at ‘actively marked’ regions and a lower score for ‘repressively marked’ regions (Additional file 1: Figure S5A–I), with the exception of intragenic enhancers. All remodellers were enriched across all three bivalent states in the model (bivalent promoter, flanking bivalent promoter and bivalent enhancer), but there was no significant difference between Group 1 and Group 2 (Additional file 1: Figure S5J–L; Student’s unpaired T-test, p = 0.118). The chromHMM model also defines states of active transcription, and we found that there is no significant enrichment of remodellers in the regions flanking active transcription and all remodellers were significantly depleted from regions of active transcription (Additional file 1: Figure S5M–O). Moreover, as there is also no significant enrichment at intragenic enhancers (Additional file 1: Figure S5D), we speculate that the lack of enrichment at sites of active transcription (including the intragenic enhancers) may be due to the highly dynamic and transient nature of transcription, which prevents a stable signal of the chromatin remodellers from being detected within these regions.
An interesting exception to the stratification of the remodellers described above is the presence of CHD1 at ‘actively marked’ promoters. At both annotated ‘actively marked’ promoters (Fig. 2c) as well as promoters defined by the chromHMM segmentation (Additional file 1: Figure S5A), CHD1 is found to be significantly enriched at a level comparable to the Group 1 remodellers. As CHD1 does not display the same level of significant enrichment at any other ‘actively marked’ regulatory regions, we speculate that non-epigenetic factors may be driving this high level of promoter binding.
We then took a second approach where we tested the average distribution of the ChIP-seq signal of key gene regulatory histone modifications: H3K4me3, H3K4me1, H3K27me3 and H3K9me3, across the remodeller binding sites (Additional file 1: Figure S6A–D). Our results demonstrated that the active histone marks, H3K4me3 and H3K4me1, displayed a higher average signal across the binding sites of Group 1 remodellers, and repressive histone marks exhibited a higher average signal across Group 2 remodellers. This further confirms the association of Group 1 with ‘actively marked’ regions and Group 2 with ‘repressively marked’ regions. Additionally, at ‘actively marked’ promoters and bivalent promoters defined by these histone modifications, the Pearson correlation coefficient between the remodellers in Group 1 was higher compared to Group 2 (Additional file 1: Figure S6E–F), indicating that there is more similarity in the binding pattern between Group 1 compared to Group 2 at active regions. Furthermore, we found the converse was true for repressed promoters (Additional file 1: Figure S6G–H). Taken together, these data suggest that while all remodellers are associated with ‘actively marked’ chromatin states, Group 1 remodellers have a more pronounced role at these ‘actively marked’ regions, while Group 2 remodellers play a greater role at ‘repressively marked’ regions.
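The averaging of histone modification signal across binding sites can be sketched as follows. This Python example is a simplified illustration, not the method used to produce the figures: it assumes a per-base signal stored in a list, a fixed window centred on each binding site, and that sites too close to the ends of the signal are skipped.

```python
def average_profile(signal, centres, flank):
    """Average a per-base signal in windows of +/- flank bp around each centre."""
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for centre in centres:
        if centre - flank < 0 or centre + flank >= len(signal):
            continue  # skip windows that run off either end of the signal
        for i in range(width):
            totals[i] += signal[centre - flank + i]
        used += 1
    if used == 0:
        raise ValueError("no window fits inside the signal")
    return [t / used for t in totals]


# Toy per-base signal and two hypothetical binding-site centres.
toy_signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
profile = average_profile(toy_signal, centres=[2, 5], flank=1)
print(profile)
```

Comparing such profiles between the pooled sites of each remodeller group gives the kind of 'higher average signal' contrast described above.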
Chromatin remodellers bind to AT-rich DNA and are found at unmethylated regions
We were interested to know whether DNA methylation or the DNA sequence within the remodeller binding site could also stratify the chromatin remodellers into Group 1 and 2. Although chromatin remodellers are responsible for positioning nucleosomes, the genome-wide patterning of nucleosomes is also determined in part by the DNA sequence, where there is a preference for DNA rich in ApT and TpA dinucleotides that are able to bend more sharply around the histone octamer [1, 37, 38]. We calculated the density of all four nucleotides within each chromatin remodeller binding site and found a higher density of A and T nucleotides within the remodeller binding sites for all of the remodellers, and no difference between Group 1 and Group 2 (Additional file 1: Figure S7A-B). Additionally, the ApT and TpA dinucleotides within Group 1 and Group 2 remodeller binding sites occur as frequently as all other dinucleotides in the genome, except for CpG dinucleotides (Additional file 1: Figure S7C–D). Together, this suggests that while intrinsic nucleosome positioning is determined in part by sequence composition, chromatin remodeller nucleosome targeting is not dependent on overall DNA sequence composition, nor does the sequence stratify remodellers into Group 1 and Group 2.
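The sequence composition measurements described above amount to counting nucleotides and overlapping dinucleotides within each binding site. A minimal Python sketch is given below; the function names and the toy sequence are hypothetical, and the sketch ignores ambiguous bases and strand considerations.

```python
from collections import Counter


def nucleotide_fractions(seq):
    """Fraction of A, C, G and T in a DNA sequence (other characters are ignored)."""
    seq = seq.upper()
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides, e.g. 'ATAT' contains AT twice and TA once."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    return Counter(p for p in pairs if set(p) <= set("ACGT"))


toy_site = "ATATCGAT"
print(nucleotide_fractions(toy_site))
print(dinucleotide_counts(toy_site))
```

Here `dinucleotide_counts(seq)["CG"]` gives the CpG count for a site, which can be divided by the site length to obtain a CpG density.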
DNA methylation of cytosine residues occurs primarily in a CpG context and has a complex role. At promoters, it is associated with chromatin compaction, but in gene bodies it is associated with active expression [39, 40, 41, 42, 43, 44]. We next examined whether DNA methylation was present at the remodeller binding sites. Overall, we detected very low levels of DNA methylation across all remodeller binding sites (Additional file 1: Figure S7E). However, we found that remodellers bound to regions with a higher CpG density had less DNA methylation within their binding sites and, conversely, regions with lower overall CpG density displayed higher levels of DNA methylation (Additional file 1: Figure S7E–F).
Chromatin preferentially occupied by Group 1 remodellers is nucleosome depleted
Accessible chromatin is associated with active regulatory elements and gene activity. Therefore, to determine whether chromatin remodeller binding was correlated with gene expression in LNCaP cells, the ChIP-seq read counts of all Group 1 and Group 2 remodellers were pooled. Then, the difference between the read counts (signal) for Group 1 remodellers compared to Group 2 remodellers at the promoters of expressed genes was calculated. Genes were separated into those with a higher level of Group 1 chromatin remodeller binding and those with a higher level of Group 2 remodeller binding at their promoters, and the level of expression for each group was plotted as log transcripts per million (logTPM). Of the 22,393 genes that were actively expressed in LNCaP cells, 14,669 had a higher signal for Group 1 chromatin remodellers and 7724 had a higher signal for Group 2 chromatin remodellers at their promoters. Genes with a higher level of Group 1 remodellers had higher expression compared to those that had a higher level of Group 2 remodellers (Fig. 4d), supporting the conclusion that Group 1 remodellers are associated with increased gene activity. We performed a gene ontology (GO) analysis of active genes that were more highly bound by either Group 1 or Group 2 remodellers using GREAT. For gene promoters with higher Group 1 binding, the most significant GO terms enriched were related to the nucleosome, transcriptional processes such as rRNA binding and tRNA modification, and mRNA processing (Fig. 4e). Enriched GO terms for active genes with a higher Group 2 remodeller signal include those associated with mitochondrial processes such as ATP synthesis and respiratory chain activity (Fig. 4e). Together, this demonstrates that the Group 1 and Group 2 remodellers maintain and associate with the regulation of different cellular pathways.
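The separation of genes by pooled remodeller signal can be illustrated with a short Python sketch. It is a simplified reading of the description above, not the analysis code: read counts are assumed to be already comparable between remodellers (no depth normalisation is shown), genes with equal pooled signal are left out of both sets, and the gene names and counts are hypothetical.

```python
GROUP_1 = ("BRG1", "SNF2H", "CHD3", "CHD4")
GROUP_2 = ("BRM", "INO80", "SNF2L", "CHD1")


def split_genes_by_group_signal(promoter_counts):
    """Split genes by whether pooled Group 1 or pooled Group 2 promoter signal is higher.

    promoter_counts maps gene -> {remodeller: ChIP-seq read count at the promoter}.
    """
    group1_higher, group2_higher = [], []
    for gene, counts in promoter_counts.items():
        pooled_1 = sum(counts.get(r, 0) for r in GROUP_1)
        pooled_2 = sum(counts.get(r, 0) for r in GROUP_2)
        if pooled_1 > pooled_2:
            group1_higher.append(gene)
        elif pooled_2 > pooled_1:
            group2_higher.append(gene)
        # Ties are assigned to neither set (an assumption of this sketch).
    return group1_higher, group2_higher


# Hypothetical promoter read counts for three made-up genes.
toy_counts = {
    "geneA": {"BRG1": 30, "SNF2H": 12, "BRM": 5},
    "geneB": {"CHD4": 2, "INO80": 9, "CHD1": 14},
    "geneC": {"CHD3": 7, "SNF2L": 7},
}
print(split_genes_by_group_signal(toy_counts))
```

The expression values (logTPM) of the two resulting gene sets can then be compared directly.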
Group 1 and Group 2 remodellers are defined by chromatin architectural features
We next examined the distribution of chromatin remodellers within the context of higher-order three-dimensional chromatin structure itself using Hi-C data from LNCaP cells . We divided chromatin into TADs (85.4% of the genome), TAD boundaries (2.6% of the genome) and unorganised chromatin (12.0% of the genome; Additional file 1: Figure S8A) and examined the distribution of Group 1 and Group 2 remodellers within each category. Remarkably, we found that 90% to 94% of Group 1 remodellers and 75% to 85% of Group 2 remodeller binding sites were located within TADs or at TAD boundaries, indicating that chromatin remodellers preferentially localised to sites of highly organised chromatin (Fig. 5d). We performed the GAT analysis and found a small but positive and significant enrichment of all eight of the remodellers at both TADs and TAD boundaries (Additional file 1: Figure S8B–C), which was expected as the majority of the genome is within these organised chromatin structures.
Within TADs, chromatin forms DNA loops to facilitate interactions such as those between enhancers and promoters (Fig. 5e), and these loops are in part affected by the positioning of nucleosomes [58, 59]. Our previous work has shown cancer-specific anchor points of long-range chromatin ‘loops’ are enriched for enhancers and promoters and contain a remodelled epigenetic signature, where ‘active’ marks H3K4me1, H3K4me3 and H3K27ac are increased . Whether chromatin remodellers are enriched at the anchor points of these chromatin loops remains unknown. Our linear data suggest ‘active’ chromatin ‘loops’ that bring together promoters and enhancers would be more enriched for Group 1 remodellers compared to Group 2. To test this, we separated the anchor points of the long-range chromatin loops into those that contained at least one active promoter or active enhancer using the chromHMM chromatin state data (Type A anchors, ~ 20% of all anchors) and a second group that did not contain either of these regulatory elements (Type B anchors, ~ 80% of all anchors). We found that at Type A loop anchors, the remodellers continue to stratify into Group 1 and Group 2, with Group 1 chromatin remodellers significantly (p <0.001) more enriched than Group 2 (Fig. 5f, g). Interestingly, Type B loop anchors that were devoid of an active promoter or enhancer were significantly depleted (p < 0.001) of all chromatin remodellers (Fig. 5h), suggesting that these anchors do not require ongoing chromatin remodelling activity. Examples of chromatin loops with Group 1 remodeller binding and active regulatory elements are found at the KLK locus on chromosome 19, and a gene dense region on chromosome 1 (Fig. 5i). Together, this demonstrates a role for all Group 1 remodellers in chromatin three-dimensional architecture.
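The split of loop anchors into Type A and Type B can be sketched as an overlap test against chromatin state segments. The Python example below is illustrative only: the state labels, the half-open interval convention and the rule that any overlap of at least 1 bp qualifies are assumptions, and the coordinates are made up.

```python
# Hypothetical labels for the chromHMM states that count as active elements.
ACTIVE_STATES = {"active promoter", "active enhancer"}


def classify_anchor(anchor, state_segments):
    """Return 'A' if the anchor overlaps an active promoter or enhancer, else 'B'.

    anchor is (chrom, start, end); state_segments holds (chrom, start, end, state).
    """
    chrom, start, end = anchor
    for seg_chrom, seg_start, seg_end, state in state_segments:
        if (state in ACTIVE_STATES and seg_chrom == chrom
                and seg_start < end and start < seg_end):
            return "A"
    return "B"


toy_states = [
    ("chr1", 1_000, 2_000, "active promoter"),
    ("chr1", 50_000, 52_000, "repressed"),
]
print(classify_anchor(("chr1", 0, 10_000), toy_states))
print(classify_anchor(("chr1", 50_000, 60_000), toy_states))
```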
Our data revealed that the grouping of chromatin remodellers is remarkably consistent across all of the ‘actively marked’ and ‘repressively marked’ epigenetic features we examined but, intriguingly, independent of the core modification of DNA methylation. Segregation into these two groups persists at both defined DNA regulatory elements and chromHMM chromatin states, which are a composite of histone modification marks. While Group 1 remodellers have a significantly higher enrichment than Group 2 at ‘actively marked’ regions, it is still noteworthy that all remodellers have some level of enrichment at these regions, which highlights the dynamic and complex nature of active chromatin. Previous research also demonstrates a role of Group 2 remodellers at active chromatin in embryonic mouse cells, such as INO80 in maintaining open chromatin in pluripotency genes and CHD1 and its role in RNA polII stalling in active gene expression. We also note that we find some divergent results for the remodeller CHD1. A previous study in mouse embryonic stem cells found CHD1 to only be enriched at active promoters. In our study, CHD1 had an equivalent level of enrichment at active promoters as the Group 1 remodellers, while still being enriched at ‘repressively marked’ chromatin together with the other Group 2 remodellers. These differences may be due to the different cell types, embryonic cells versus somatic cells, or may reflect differences between normal versus cancer cells. It will be interesting to interrogate these differences in future studies.
It was surprising that upon investigation of the direct overlap within Group 1 and Group 2 remodeller binding sites, we found less than 6% were in common for the entire group. This suggests that, while remodellers within the same group have high statistical enrichment at the same class of regulatory elements, they do not always bind to the exact same genomic region. For example, Group 1 remodellers demonstrate a preference for ‘actively marked’ promoters as a collective group, but they often localise to their own subset of all active promoters, showing that each remodeller potentially has a distinct and unique role. Additional data will be needed to further resolve the types of complexes each of the core remodeller proteins is capable of forming. Expansion of the existing ChIP-seq data to include various accessory subunits will help refine this analysis, in combination with fine-tuning existing definitions of DNA regulatory elements and chromHMM states. Broadly defined, the further subdivision of DNA regulatory elements and states will enable subtyping and will provide additional details to determine under which conditions, states and combinations these remodeller proteins act in a coordinated or antagonistic manner.
For the purposes of this study, we did not include all known histone modifications in our analysis; for example, H4K20 methylation and H4K16 acetylation (H4K16ac) were not included [64, 65, 66]. Thus, the proportion of remodeller binding found at ‘unmarked’ regions may contain histone modifications not present in our analysis. However, the key marks for defining DNA regulatory elements and genes were included and therefore provide a comprehensive view of key gene regulatory chromatin features, and moreover, less than 20% of binding sites fell within the ‘unmarked’ regions. As more chromatin data become available, such as histone variants and histone modifications with a structural role, it will be interesting to determine whether these also stratify the remodellers in a similar fashion.
Interestingly, we found that remodellers consistently segregated into Group 1 and Group 2 at architectural chromatin features, such as chromatin loop anchors and CTCF sites, highlighting that chromatin remodellers are also associated with higher-order chromatin architecture. Prior to this study, BRG1 was the sole remodeller that had a demonstrated role in chromatin architecture and a well-established role in maintaining enhancer–promoter interactions [11, 60, 67]. Additionally, in MCF10A cells, BRG1 increases the stability of TAD boundaries to strengthen enhancer–promoter chromatin loops and maintain the established patterns of gene expression. Subsequent loss of BRG1 binding weakens these interactions, concomitant with a down-regulation of gene expression. Our results imply that, in fact, all Group 1 remodellers (BRG1, SNF2H, CHD3 and CHD4) could have an individual or a combinatorial role at chromatin loop anchors that contain active promoters or enhancers. When we examined LADs, however, we found them to be depleted of all remodellers. We also investigated the association with the architectural protein CTCF and found that Group 1 remodellers are more enriched at CTCF-binding sites compared to Group 2. Together, our data suggest Group 1 remodellers have a prominent role at ‘active’ chromatin loops, whereas ‘inactive’ chromatin loops do not require remodeller binding to maintain their repressed state.
Our finding that the distribution of chromatin remodellers across CpG islands follows three distinct patterns (BRG1, SNF2H and CHD4 at borders; CHD3 and CHD1 in the centre; and SNF2L, INO80 and BRM depleted from the centre of the island) suggests that there are different mechanisms at play for maintaining nucleosome positions at these regions. It was surprising that neither CHD3 nor CHD4 showed any level of enrichment at methylated CpG islands. These remodellers form a key part of the NuRD complex that also contains MBD2, which prefers hypermethylated promoters. It is possible that once the DNA of any given genomic region becomes methylated and the chromatin is compacted, it may no longer require the remodelling complex to stay bound to the chromatin.
Previous studies that have examined two to three remodeller proteins show consistency with our findings. The overlap of BRG1 and CHD4 has previously been reported in various cell types, where they were reported to have opposing control over regulatory chromatin [23, 26, 69, 70]. For example, in mice, BRG1, CHD4 and SNF2H have extensive overlap in their binding patterns and it has been implied that the sequential order of their binding is important for their correct function. In our data, we found that these remodeller proteins were also enriched at the same genomic features, suggesting they may also have opposing functions in LNCaP cells. Additionally, we found overlapping enrichment patterns for CHD1 and ISWI remodeller SNF2L, which in yeast have been reported to have both competing and coordinated functions. CHD1 and SNF2L are responsible for maintaining the phasing of nucleosomes at promoters, but compete for different nucleosome spatial arrangements, impacting the kinetics of gene activation [15, 24, 25, 71]. CHD1 and SNF2L also work together at gene bodies where they are thought to maintain chromatin integrity during transcription elongation by preventing histone exchange during nucleosome turnover [72, 73, 74]. Moreover, there has been a report of overlapping activity between the ATPase subunits of the NuRD remodelling complex, CHD3 and CHD4, which also occurred in our data.
Functional studies in human cancers have demonstrated the integrated relationships of chromatin remodellers, through the identification of synthetic–lethal relationships. Synthetic–lethal relationships occur where the cancer develops a loss of function for one protein that creates a dependency on another protein. Synthetic–lethal relationships exist for ATPases, BRG1 and BRM, in triple negative breast cancer, for accessory subunits of the SWI/SNF complex, ARID1A and ARID1B, in colorectal cancer and CHD1 with transcriptional regulator PTEN in prostate cancer [75, 76, 77]. These instances of the relationship between remodellers and their function highlight the importance of studying multiple remodellers together to provide a greater understanding of the overall mechanism of chromatin remodelling and how a chromatin state is established. It is unclear from our data how much of the remodeller binding overlap reflects cooperative or antagonistic function; moreover, our study does not determine the extent of competition between the remodellers for nucleosome binding. Mechanistic studies are now required, such as gene editing approaches in different cell model systems, to further dissect the importance and/or redundancy of individual remodellers and their potential functional role in regulating chromatin, and to broaden the applicability of findings to other cell types.
In summary, our results reveal previously unknown relationships between the remodellers, and with both the two-dimensional and three-dimensional epigenome. We propose that chromatin remodellers should be examined in the context of the different classes of remodellers we identified as Group 1 or Group 2, and not solely with consideration to existing structural families or in isolation (Fig. 6). These observations may inform decisions for future work that studies chromatin remodeller function and provides a more complete picture of chromatin remodelling action.
ChIP-seq assay and data
LNCaP chromatin remodeller ChIP-seq data are from Ye et al.: GEO accession GSE72690. Histone modification LNCaP ChIP-seq data (H3K4me3, H3K4me1, H3K27ac and H3K27me3) are from Taberlay et al.: GEO accession GSE73785, and Bert et al.: GSE38685. CTCF ChIP-seq data are from Bert et al.: GSE38685. H3K9me3, H3K36me3, Lamin B and Lamin A/C ChIP-seq data are from Du et al.: GSE98732. p300 ChIP-seq data are from Wang et al.: GSE27824. RNA polII ChIP-seq data are from Tan et al.: GSE28264. All ChIP-seq data sets were processed as previously described in Bert et al., Taberlay et al., Du et al. and Lund et al. Histone modification and chromatin remodeller peaks were called using MACS2, and Lamin domains were called with the enriched domain detector (EDD). Two ChIP-seq input data sets were provided by Ye et al., which were merged for calling remodeller peaks. MACS2 was then used to call peaks on each individual input data set; any remodeller peaks that overlapped these input peaks were removed from the remodeller data sets.
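The input-based filtering step described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Peak` tuple layout, the function names and the toy coordinates are all hypothetical, and a simple pairwise comparison stands in for whatever interval tool was actually used.

```python
from typing import List, Tuple

# Assumed representation: (chromosome, start, end) with half-open coordinates.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_input_overlaps(remodeller_peaks: List[Peak],
                          input_peaks: List[Peak]) -> List[Peak]:
    """Keep only remodeller peaks that do not overlap any peak called on input."""
    return [p for p in remodeller_peaks
            if not any(overlaps(p, q) for q in input_peaks)]


# Hypothetical toy coordinates, not taken from the study.
remodeller = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
input_artefacts = [("chr1", 150, 250)]
filtered = remove_input_overlaps(remodeller, input_artefacts)
```

The pairwise comparison is quadratic in the number of peaks; for genome-scale peak sets a sorted sweep or an interval tree would be the more practical choice.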
LNCaP Hi-C data are from Taberlay et al.: GSE73785. Hi-C data were processed through the NGSANE framework v0.5.2 as previously described in Taberlay et al. TADs were identified with the ‘domain-caller’ pipeline as described in Taberlay et al. TADs and TAD boundaries were assessed at 40 kb resolution. The percentage of the genome covered by TADs and TAD boundaries was calculated by summing the number of base pairs within TADs or boundaries, dividing by the genome size (3.095 × 10^9 bp) and multiplying by 100. Chromatin loops were called from contact count matrices at 10 kb resolution using a custom adaptation of Fit-Hi-C (contained in NGSANE; Buske et al.), supplying iteratively corrected bias offsets calculated through HiCorrector v1.1. Chromatin loops were visualised in the WashU Epigenomics Browser and Rondo (rondo.ws).
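The genome coverage calculation can be written out as a short Python sketch. The function name and the toy intervals are hypothetical, the genome size is assumed to be roughly 3.095 × 10^9 bp (hg19), and the intervals are assumed not to overlap one another.

```python
# Assumed approximate hg19 genome size in base pairs.
GENOME_SIZE_BP = 3.095e9


def percent_genome_covered(intervals, genome_size_bp=GENOME_SIZE_BP):
    """Sum the lengths of half-open (start, end) intervals as a % of the genome.

    Assumes the intervals do not overlap one another.
    """
    covered_bp = sum(end - start for start, end in intervals)
    return covered_bp / genome_size_bp * 100


# Hypothetical example: two 40 kb boundaries on a toy 1 Mb "genome".
toy_boundaries = [(0, 40_000), (80_000, 120_000)]
toy_percent = percent_genome_covered(toy_boundaries, genome_size_bp=1_000_000)
```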
LNCaP RNA-seq data are from Taberlay et al.: GSE73785, and processed clinical prostate tumour RNA-seq data were downloaded from TCGA (cancergenome.nih.gov). LNCaP RNA-seq data (n = 3) were processed as described in Taberlay et al. To determine chromatin remodeller gene expression, reads mapped to hg19/GRCh37 were counted into genes using the featureCounts program, with GENCODE v19 as the reference transcriptome, to determine the transcripts per million (TPM) value, and biological triplicates were averaged. Processed RNA-seq data (n = 486 tumours) from the TCGA prostate adenocarcinoma cohort were averaged to determine chromatin remodeller expression in clinical prostate cancer samples. The log mean values for each remodeller were plotted as cancer versus normal between the cell lines (logTPM) and the normal and tumour TCGA data sets (logRPKM), along with the linear regression line of best fit. Pearson’s correlation coefficients were calculated in R.
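As an illustration of the expression summary, the following Python sketch shows the standard TPM normalisation, replicate averaging and a Pearson correlation. It is a sketch under stated assumptions rather than the authors' code (the correlations were calculated in R); the function names, counts and gene lengths are hypothetical.

```python
import math


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise counts, then scale to sum to 1e6."""
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def mean_across_replicates(replicates):
    """Average per-gene values across replicates (one list of values per replicate)."""
    return [sum(values) / len(values) for values in zip(*replicates)]


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical counts for three genes in three replicates, with gene lengths in bp.
gene_lengths = [2000, 1000, 4000]
replicate_counts = [[100, 50, 400], [120, 40, 380], [90, 60, 410]]
replicate_tpm = [tpm(counts, gene_lengths) for counts in replicate_counts]
mean_tpm = mean_across_replicates(replicate_tpm)
```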
Chromatin accessibility data
DNaseI data are from Thurman et al., and processed data were downloaded from the ENCODE data portal (encodeproject.org/). DNaseI accessibility sites from two biological replicates were overlapped, and the intersection of both replicates was used for downstream analysis (see Chromatin remodeller enrichment and correlation analyses). NOMe-seq data are from Valdes-Mora et al.: GSE76334. NOMe-seq data were processed as previously described in Valdes-Mora et al. NOMe GpC methylation levels within remodeller binding sites were defined by first computing the methylation ratio of all GCH sites with greater than 5× coverage and then calculating the mean methylation score within each remodeller binding site. Methylation density of remodeller binding sites was plotted in R. Nucleosome occupancy at remodeller binding sites was visualised using ‘methylationPlotRegions’ from the aaRon package in R, ± 3000 bp from the centre of the remodeller binding site.
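The per-site GpC methylation score can be sketched in Python as follows. The input layout (position, methylated reads, total reads) and the function name are assumptions made for illustration; returning `None` for sites without any sufficiently covered GCH is also an assumption, since the text does not say how such sites were handled.

```python
def mean_site_methylation(gch_sites, site_start, site_end, min_coverage=5):
    """Mean methylation ratio of GCH sites inside one binding site.

    gch_sites: iterable of (position, methylated_reads, total_reads).
    Only GCH sites with coverage strictly greater than min_coverage are used.
    Returns None when no GCH site in the region passes the coverage filter.
    """
    ratios = [
        methylated / total
        for position, methylated, total in gch_sites
        if site_start <= position < site_end and total > min_coverage
    ]
    return sum(ratios) / len(ratios) if ratios else None


# Hypothetical GCH calls: the third has too little coverage, the fourth lies outside.
toy_gch = [(10, 3, 6), (20, 6, 12), (30, 1, 2), (500, 10, 10)]
toy_score = mean_site_methylation(toy_gch, site_start=0, site_end=100)
```

The same filter-then-average logic applies to the cytosine-level WGBS scores, with CpG sites in place of GCH sites.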
DNA methylation data
WGBS data are from Pidsley et al.: GSE86833. WGBS libraries were processed as previously described in Pidsley et al. CpG islands and Ensembl gene coordinates were downloaded from the UCSC genome browser. Promoter CpG islands were defined as the intersection between CpG islands and the 5′ ends of Ensembl genes. The promoter CpG islands were split into 40 equally sized bins and the average methylation score was calculated from the WGBS data using ‘ScoreMatrixList’ from the genomation package in R. Methylated CpG islands were defined as having an average methylation score above 50%. DNA methylation levels within remodeller binding sites were defined by first computing the methylation ratio of all cytosines with greater than 5× coverage and then calculating the mean methylation score within each remodeller binding site. CpG density of methylated and unmethylated CpG islands was calculated in R. Violin plots and boxplots were created with ggplot2 in R. Remodeller binding sites were overlapped with the methylated and unmethylated islands using the GenomicRanges package in R.
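The binning and thresholding of CpG islands can be sketched in Python as below. This is not a reimplementation of genomation's ‘ScoreMatrixList’; the function names are hypothetical, empty bins are simply ignored when averaging (an assumption), and methylation is expressed as a fraction so that the 50% cut-off becomes 0.5.

```python
def binned_methylation(cpgs, start, end, n_bins=40):
    """Average methylation per bin across one island.

    cpgs: iterable of (position, methylation_fraction).
    Returns a list of n_bins means, with None for bins that contain no CpG.
    """
    width = (end - start) / n_bins
    bins = [[] for _ in range(n_bins)]
    for position, score in cpgs:
        if start <= position < end:
            index = min(int((position - start) / width), n_bins - 1)
            bins[index].append(score)
    return [sum(b) / len(b) if b else None for b in bins]


def is_methylated(bin_means, threshold=0.5):
    """Call an island methylated if its mean score across covered bins exceeds the threshold."""
    observed = [m for m in bin_means if m is not None]
    return bool(observed) and sum(observed) / len(observed) > threshold


# Hypothetical 400 bp island with three measured CpGs.
toy_bins = binned_methylation([(5, 0.9), (15, 0.7), (395, 0.8)], start=0, end=400)
toy_call = is_methylated(toy_bins)
```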
ChromHMM segmentation based on Roadmap epigenomics
The Roadmap Epigenomics chromHMM model was based on five core histone modifications (H3K4me3, H3K4me1, H3K36me3, H3K27me3, H3K9me3) and trained on 60 epigenomes to categorise the genome into 15 chromatin states. There were seven chromatin states associated with ‘active’ chromatin: active promoter, flanking active promoter, transcription at 5′ and 3′ ends of a gene, strong transcription, weak transcription, active intragenic enhancers, active intergenic enhancers and poised enhancers. Three states were associated with bivalent chromatin: bivalent promoter, flanking bivalent promoter and bivalent enhancers. There were four states associated with ‘repressive’ chromatin: zinc finger genes and repeats, heterochromatin, strong polycomb and weak polycomb. The 15th state was ‘unmarked’ chromatin that did not contain any of the histone modifications in the core data set. This model was applied to histone modification ChIP-seq data from LNCaP cells using the chromHMM program (v1.10), and the chromatin states were collapsed into ‘active’, ‘repressed’, ‘bivalent’ and ‘unmarked’.
Chromatin remodeller enrichment and correlation analyses
To determine whether chromatin remodeller binding sites were enriched at the site of a specific chromatin factor or chromatin regulatory element (histone modifications, CTCF sites, LADs, DNaseI sites, chromatin loop anchors, TADs, TAD boundaries, methylated and unmethylated CpG islands and chromHMM states), we used the genome association tester (GAT; v1.0). The observed-over-expected fold change and statistical significance were calculated with 10,000 iterations, and enrichment was considered significant if the p value was less than 0.05 or 0.001. The difference between the mean observed/expected enrichment scores of Group 1 and Group 2 remodellers was assessed using an unpaired Student’s t-test in R.
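The idea behind an observed-over-expected enrichment test can be conveyed with a simplified permutation sketch in Python. This is not GAT and does not reproduce its sampling strategy: segments are simply re-placed uniformly at random on a single linear workspace, annotations are assumed not to overlap one another, and all names and coordinates are hypothetical.

```python
import random


def overlap_bp(segments, annotations):
    """Total number of bases shared between segments and (non-overlapping) annotations."""
    total = 0
    for s_start, s_end in segments:
        for a_start, a_end in annotations:
            total += max(0, min(s_end, a_end) - max(s_start, a_start))
    return total


def enrichment(segments, annotations, workspace_size, n_iter=1000, seed=0):
    """Return (observed/expected fold change, empirical one-sided p value)."""
    rng = random.Random(seed)
    observed = overlap_bp(segments, annotations)
    simulated = []
    for _ in range(n_iter):
        shuffled = []
        for start, end in segments:
            length = end - start
            # Re-place each segment at a random position, preserving its length.
            new_start = rng.randrange(0, workspace_size - length + 1)
            shuffled.append((new_start, new_start + length))
        simulated.append(overlap_bp(shuffled, annotations))
    expected = sum(simulated) / n_iter
    fold = observed / expected if expected > 0 else float("inf")
    # Add-one correction so the empirical p value is never exactly zero.
    p_value = (sum(s >= observed for s in simulated) + 1) / (n_iter + 1)
    return fold, p_value


# Hypothetical toy data: four binding sites sitting entirely inside two annotations.
toy_annotations = [(1000, 2000), (50000, 51000)]
toy_sites = [(1200, 1400), (1500, 1700), (50200, 50400), (50500, 50700)]
toy_fold, toy_p = enrichment(toy_sites, toy_annotations, workspace_size=100_000)
```

With 10,000 iterations, as used in the analysis above, the smallest attainable empirical p value under this add-one scheme is about 0.0001, which is why the number of iterations bounds the significance thresholds that can be tested.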
We defined the percentage of overlap between each of the remodellers, and between the remodellers and TADs and TAD boundaries, by intersecting the peaks identified from the ChIP-seq and Hi-C data. Histone modification average signal plots over chromatin remodeller peaks and the heatmaps of putative active enhancers and chromatin remodeller signal over CpG islands and promoters were created with SeqPlots and deepTools2. Histone marks were plotted ± 2 kb from the centre of the remodeller binding sites. At CpG islands, remodeller ChIP-seq signal for both average plots and heatmaps was plotted ± 2 kb from the centre of the island. Each row of the heatmap is an individual CpG island and displays the remodeller ChIP-seq signal, sorted by the average signal across all remodellers in decreasing order. Pearson’s correlation matrices of remodellers at promoters and DNaseI sites were calculated in R.
Gene ontology enrichment
Gene promoters were defined as 2 kb surrounding the TSS of expressed genes. Read counts of the chromatin remodeller ChIP-seq data within the promoter regions of expressed genes were calculated, and counts were merged across all Group 1 remodellers and, separately, across all Group 2 remodellers. Subtracting the total read count of Group 2 remodellers from that of Group 1 defined which promoters had a higher signal for Group 1 and which had a higher signal for Group 2. The promoters assigned to each group were analysed for enrichment of gene ontology terms using GREAT, with the whole genome as background and each promoter assigned to the single nearest gene. GO terms reported are the top 10 most significant from the Molecular Function, Biological Process or Cellular Component gene set, with at least 25 observed genes in the data set.
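The promoter assignment by signal difference can be sketched in Python as follows. The dictionary layout, promoter identifiers and counts are hypothetical, and the handling of promoters with equal signal (labelled "tie" here) is an assumption, since the text does not state how ties were treated.

```python
def assign_promoters(group1_counts, group2_counts):
    """Assign each promoter to the group with the higher merged read count.

    Both arguments map a promoter identifier to the read count summed over
    all remodellers in that group.
    """
    assignment = {}
    for promoter, g1 in group1_counts.items():
        difference = g1 - group2_counts.get(promoter, 0)
        if difference > 0:
            assignment[promoter] = "Group 1"
        elif difference < 0:
            assignment[promoter] = "Group 2"
        else:
            assignment[promoter] = "tie"  # assumed handling, not from the source
    return assignment


# Hypothetical merged read counts per promoter.
toy_group1 = {"P1": 120, "P2": 30, "P3": 50}
toy_group2 = {"P1": 40, "P2": 90, "P3": 50}
toy_assignment = assign_promoters(toy_group1, toy_group2)
```

A raw difference favours the group with the greater overall sequencing depth, so in practice the counts would usually be depth-normalised before subtraction; whether this was done is not stated in the text.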
KAG, SJC and PCT were involved in conceptualisation; KAG, CMG, QD and KS were involved in formal analysis; SJC and PCT were involved in funding acquisition; QD and JS were involved in investigation; KAG was involved in writing—original draft; KAG, MPM, JAK, SJC and PCT were involved in writing—reviewing and editing; KAG and MPM were involved in visualisation; JAK, CS, SJC and PCT were involved in supervision; SJC and PCT are senior authors. All authors read and approved the manuscript.
We would like to thank the Garvan Institute of Medical Research for the use of the computing resources.
The authors declare that they have no competing interests.
Availability of data and materials
All sequencing data sets used in this study have been previously published and are available through Gene Expression Omnibus (ncbi.nlm.nih.gov/geo/). GEO accession numbers are as follows: chromatin remodeller ChIP-seq GSE72690; histone modification ChIP-seq data GSE73785, GSE38685, and GSE98732; CTCF ChIP-seq GSE38685; Lamin B and Lamin A/C ChIP-seq GSE98732; p300 ChIP-seq GSE27824; RNA polII ChIP-seq GSE28264; Hi-C GSE73785; RNA-seq GSE73785; NOMe-seq GSE76334; and whole-genome bisulphite sequencing GSE86833. TCGA data are available from the data portal at cancergenome.nih.gov, and ENCODE DNaseI data are available from the ENCODE data portal at encodeproject.org/.
K.A.G. is supported by an Australian Postgraduate Award (APA) and Research Excellence Award from UNSW Sydney. Q.D. is supported by an APA from UNSW Sydney. S.J.C. is a National Health and Medical Research Council (NHMRC) Senior Principal Research Fellow #1063559. P.C.T. is a NHMRC Career Development Fellow #1109696. This work was supported by grants from Cure Cancer Australia Foundation Project Grant #1060713 to P.C.T. and NHMRC Project Grants #1011447 and #1070418 to S.J.C. and C.S. and #1051757 to S.J.C. and P.C.T.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.