Quantitative profiling of BATF family proteins/JUNB/IRF hetero-trimers using Spec-seq
BATF family transcription factors (BATF, BATF2 and BATF3) form hetero-trimers with JUNB and either IRF4 or IRF8 to regulate cell fate in T cells and dendritic cells in vivo. While each combination of the hetero-trimer has a distinct role, some degree of cross-compensation was observed. The basis for the differential actions of IRF4 and IRF8 with BATF factors and JUNB is still unknown. We propose that the differences in function between these hetero-trimers may be caused by differences in their DNA binding preferences. While all three BATF family transcription factors have similar binding preferences when binding as a hetero-dimer with JUNB, the cooperative binding of IRF4 or IRF8 to the hetero-dimer/DNA complex could change the preferences. We used Spec-seq, which allows for the efficient and accurate determination of relative affinity to a large collection of sequences in parallel, to find differences between cooperative DNA binding of IRF4, IRF8 and BATF family members.
We found that without IRF binding, all three hetero-dimer pairs exhibit nearly the same binding preferences for both expected wild-type binding sites, TRE (TGA(C/G)TCA) and CRE (TGACGTCA). IRF4 and IRF8 show very similar DNA binding preferences when binding with any of the three hetero-dimers. No major change of binding preferences was found in the half-sites between different hetero-trimers. IRF proteins bind with substantially lower affinity with either a single nucleotide spacer between the IRF and BATF binding sites or with an alternative mode of binding in the opposite orientation. In addition, the preference for the CRE binding site was reduced when either IRF bound, in all BATF–JUNB combinations.
The specificities of BATF, BATF2 and BATF3 are all very similar as are their interactions with IRF4 and IRF8. IRF proteins binding adjacent to BATF sites increases affinity substantially compared to sequences with spacings between the sites, indicating cooperative binding through protein–protein interactions. The preference for the type of BATF binding site, TRE or CRE, is also altered when IRF proteins bind. These in vitro preferences aid in the understanding of in vivo binding activities.
Keywords: BATF, JUNB, IRF, Transcription factors, Specificity
The signature characteristic of basic leucine zipper (bZIP) transcription factors is the alpha-helical bZIP domain that contains both a DNA binding region and a leucine zipper motif. The leucine zipper motif allows bZIP transcription factors to form either hetero- or homo-dimers that bind DNA. One of the best-known examples of hetero-dimerizing bZIP transcription factors is the FOS–JUN dimer, which is also known as activator protein 1 (AP-1). AP-1 family proteins can regulate gene expression either on their own or with a partner via closely spaced DNA-binding sites [2, 3]. Basic leucine zipper transcription factor ATF-like (BATF) family transcription factors (BATF, BATF2, and BATF3) belong to the family of bZIP transcription factors and are considered AP-1 transcription factors because of their DNA binding preferences. BATF family proteins form hetero-dimers with JUN family proteins and can recognize the 7-long TPA response element (TRE: TGA(C/G)TCA) or the 8-long cyclic AMP response element (CRE: TGACGTCA) [4, 5, 6]. The bZIP domain is highly conserved among all three BATF family members. None of the BATF transcription factors has a transcriptional activation domain, and they are considered to act as inhibitors of AP-1 activity. BATF and BATF3 are relatively small compared to other bZIP transcription factors (125 and 118 amino acids, respectively) and contain no additional domains other than bZIP. BATF2 has an extra carboxy-terminal domain of unknown function.
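The two motif definitions above can be made concrete with a short Python sketch. It classifies a sequence by exact match to the TRE or CRE consensus as written in the text; the function name `classify_site` and the exact-match rule are illustrative assumptions, not part of the published analysis, which measures graded affinities rather than yes/no matches.

```python
import re

# Motifs exactly as given in the text:
#   TRE (7-long): TGA(C/G)TCA
#   CRE (8-long): TGACGTCA
# Both consensus sets are closed under reverse complement, so a match
# on one strand implies a match on the other.
TRE = re.compile(r"TGA[CG]TCA")
CRE = re.compile(r"TGACGTCA")


def classify_site(seq: str) -> str:
    """Label a sequence 'CRE', 'TRE', or 'none' by exact consensus match.

    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    if CRE.search(seq):
        return "CRE"
    if TRE.search(seq):
        return "TRE"
    return "none"


if __name__ == "__main__":
    for s in ["ccTGACTCAgg", "TGAGTCA", "aaTGACGTCAtt", "TGATTCA"]:
        print(s, classify_site(s))
```

The CRE differs from the TRE by one extra central base pair, so the two patterns never match the same stretch of sequence.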
mRNA expression analysis showed that BATF and BATF3 are highly expressed in lymphocytes while BATF2 is mostly expressed in macrophages. Although they are sometimes expressed in the same cell types, each BATF family member has specific functions. For example, BATF controls TH17 differentiation and BATF3 is required for the development of CD8α classical dendritic cells (cDC). Interestingly, BATF and BATF3 can cross-compensate in vivo in T cells and dendritic cells, but BATF2 can only compensate for BATF3 in dendritic cells. The mechanism by which the family members compensate for each other is not clear.
Interferon regulatory factor (IRF) family transcription factors have diverse roles in regulating the immune system. IRFs have a conserved DNA binding domain (DBD) known to bind to the interferon-stimulated response element (ISRE) on its own [12, 13]. While the mammalian IRF family comprises nine members, IRF1 to IRF9, only IRF4 and IRF8 are known to function cooperatively with BATF family transcription factors. Structurally, IRF4 and IRF8 contain an IRF-association domain (IAD) C-terminal to the DBD. When binding cooperatively with BATF, the IAD is proposed to interact with the leucine zipper region of BATF, and the DBD binds to a “GAAA” motif either 0 or 4 base pairs away from the TRE, in opposite orientations [11, 14, 15].
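The spacing between an IRF half-site and the TRE can be read off a sequence with a few lines of Python. The sketch below is a simplification under stated assumptions: it looks only for the core “GAAA” motif or its reverse complement “TTTC”, only on the 5′ side of a TRE, and it reports every motif/spacer pair it finds without asserting which orientation belongs to which spacing. The function name `irf_tre_spacers` is hypothetical.

```python
import re

TRE = re.compile(r"TGA[CG]TCA")
# Lookahead so that overlapping candidate half-sites are all reported.
IRF_HALF = re.compile(r"(?=(GAAA|TTTC))")


def irf_tre_spacers(seq: str):
    """Return (irf_motif, spacer_bp) pairs for IRF half-sites 5' of a TRE.

    spacer_bp is the number of bases between the end of the IRF
    half-site and the first base of the TRE (0 means directly adjacent).
    """
    seq = seq.upper()
    hits = []
    for tre in TRE.finditer(seq):
        upstream = seq[: tre.start()]
        for irf in IRF_HALF.finditer(upstream):
            motif = irf.group(1)
            spacer = tre.start() - (irf.start() + len(motif))
            hits.append((motif, spacer))
    return hits


if __name__ == "__main__":
    print(irf_tre_spacers("GAAATGAGTCA"))      # adjacent half-site
    print(irf_tre_spacers("TTTCACGTTGACTCA"))  # four bases in between
```

Filtering the returned pairs for spacers of 0, 1, or 4 would pick out the arrangements compared in this study.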
The basis for the differential actions of IRF4 and IRF8 with BATF factors is still under investigation. One potential explanation could be subtle differences in cooperative DNA binding between BATF factors and IRFs. Iwata et al. found that a “T” preference 8 base pairs 5′ to the TRE can affect the strength of the T cell antigen receptor signal. We propose that the differences in function between these hetero-trimers are caused by differences in their DNA binding preferences. We used Spec-seq, which allows for the efficient and accurate determination of relative affinity to a large collection of sequences in parallel [17, 18, 19, 20, 21], to find differences between cooperative DNA binding of IRF4, IRF8 and BATF family members.
Spec-seq is based on the principle that the relative binding affinities of a collection of DNA sequences can be measured by separating the bound and unbound fractions of DNA and determining the ratios of each sequence in the two fractions (see “Methods”). We have used this principle to measure binding specificity many times previously, but with methods that were low-throughput, allowing the measurement of relative affinity to only a few sequences per assay [22, 23, 24, 25, 26, 27]. With the development of new sequencing technologies, Spec-seq allows that principle to be applied to measure the relative binding affinities of hundreds to thousands of sequences per assay [17, 18, 19, 20, 28]. We have recently demonstrated that it can be easily extended to measure the effects of modified bases on binding affinity, and also showed its high accuracy by comparison with a two-color competitive fluorescence anisotropy method. Spec-seq can also be readily adapted to measure the cooperativity of binding between two proteins to the same DNA sequence, in a method we call Coop-seq [17, 29, 30]. In this paper Spec-seq is applied for the first time to the study of hetero-trimeric protein–DNA complexes.
Spec-seq of BATF/BATF2/BATF3 with JUNB
IRF4 and IRF8 spec-seq with BATF/BATF2/BATF3 and JUNB
Change in BATF/BATF2/BATF3 and JUNB specificity with IRF4 and IRF8 binding
We have found that the quantitative specificities of BATF, BATF2 and BATF3 are all very similar over a large collection of binding sites. The main difference is that BATF2 and BATF3 have a slight preference for 8-long CRE sites over 7-long TRE sites that is not observed for BATF. IRF4 and IRF8 have very similar specificities in combination with any of the BATF proteins. In every case, IRF sites that are immediately adjacent (0 spacer sites) are preferred over those that have a single base in between, which strongly suggests cooperative binding through protein–protein interactions. The preference for the 0 spacer sites over the 4 spacer sites, with the IRF site in the opposite orientation, is even stronger. The fact that such combinations are observed in in vivo binding sites suggests that there are other, currently unknown, factors contributing to complex formation in vivo. Although the specificities of the BATF proteins are very similar, as are those of the IRF proteins, there are some significant differences in the interaction energies that may account for differential binding in vivo.
BATF, BATF2 and BATF3 each can form dimers with JUNB and bind DNA with very similar specificities. Each dimer can also interact with IRF4 and IRF8 to form hetero-trimeric protein complexes that bind to DNA with similar, but somewhat distinct quantitative preferences, especially regarding the spacings between the monomeric sites. Spec-seq is an effective method to measure the relative affinities to hundreds of alternative binding sites in parallel.
Protein expression and purification
Full length human BATF, BATF3 and a truncated version of BATF2 (aa 1–142) were cloned into a pUC19 based plasmid with T7 promoter and T7 terminator. Only the N-terminal bZIP domain of BATF2 was used, to make it equivalent to BATF and BATF3 and because earlier work had shown that full length BATF2 did not bind TRE sequences with JUNB [32, 33]. Each protein construct contains an N-terminal mCherry followed by a cleavage site for Tobacco Etch Virus nuclear-inclusion-a endopeptidase (TEV protease) and finally the protein of interest. In addition, a truncated version of human JUNB (aa 148–347) with a C-terminal 6-histidine (6His) tag was cloned into a pBR322 based plasmid with T7 promoter, T7 terminator, kanamycin resistance and no rop gene. Each BATF plasmid was co-transformed with the JUNB plasmid into SHuffle T7 Express Competent E. coli (NEB) and grown in Luria broth (Sigma). Protein expression was induced by adding 0.4 mM isopropyl-β-thiogalactoside (IPTG) for 16 h at 25 °C. The proteins were purified using Ni–NTA agarose (Qiagen) following the manufacturer’s instructions, and the mCherry-colored flow-through was collected. The mCherry on the BATF proteins serves as an indicator of the presence of mCherry-BATF. Since the BATF proteins contain only mCherry and no affinity tags, all 6His purified proteins were hetero-dimerized BATF–JUNB. The mCherry on the BATF proteins was cleaved off using ProTEV Plus (Promega) following the manufacturer’s instructions.
Full length human IRF4 and mouse IRF8 were cloned into a pUC19 based plasmid with T7 promoter and T7 terminator containing an N-terminal strep-tag followed by a cleavage site for thrombin protease as described (39). The construct was transformed into Escherichia coli BL21(DE3) and grown in Luria broth (Sigma). Protein expression was induced by adding 0.4 mM isopropyl-β-thiogalactoside (IPTG) for 3 h at 30 °C. The proteins were purified using Strep-Tactin Superflow (IBA Life Sciences) following the manufacturer’s instructions. The strep-tag was cleaved off by thrombin protease digestion for 8 h at room temperature.
Library design and preparation
The BATF–JUNB Spec-seq library was designed by flanking the degenerate sequences of interest (those in Fig. 1b) with a 5′ flanking sequence of GATAGTCTCATTTTCACCCCGT and a 3′ flanking sequence of TTGTTCCATTACAGTATCTGT for downstream processing. The IRF Spec-seq library was designed by flanking the degenerate sequences of interest (those in Fig. 2b) with a 5′ flanking sequence of GAGTCGTCTCGTCAGCACTA and a 3′ flanking sequence of CCGTAGAGCACTCAGGTC for downstream processing. Libraries were ordered as single-stranded DNA oligos from IDT. To make double-stranded DNA (dsDNA) libraries, 100 pmol of single-stranded degenerate template sequences were mixed with an equal amount of the appropriate reverse complement primer (ACAGATACTGTAATGGAAC or GACCTGAGTGCTCTACGG). In the presence of Taq Polymerase (Lambda Biotech), a brief 10-s denaturation followed by 10 min of annealing/extension at 55 °C was sufficient to make dsDNA libraries. Because any unextended single-stranded DNA (ssDNA) could contaminate the unbound band, the reaction mix was digested with 1 ml NEB Exo I exonuclease (New England Biolabs) for 30 min. All final dsDNA products were purified with PCR purification columns (QIAGEN) and eluted in MilliQ water (Millipore).
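The relationship between each 3′ flanking sequence and its extension primer can be checked directly. The following Python sketch uses only the sequences quoted above; the helper names are illustrative. It confirms that each primer matches the start of the reverse complement of the corresponding 3′ flank, which is what allows the primer to anneal at the 3′ end of the template and be extended across the degenerate region.

```python
# Sequences copied from the library design described in the text.
BATF_3P_FLANK = "TTGTTCCATTACAGTATCTGT"
IRF_3P_FLANK = "CCGTAGAGCACTCAGGTC"
BATF_PRIMER = "ACAGATACTGTAATGGAAC"
IRF_PRIMER = "GACCTGAGTGCTCTACGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_anneals_to_flank(primer: str, flank: str) -> bool:
    """True if the primer equals the 5' end of the flank's reverse complement.

    Hypothetical helper: a prefix match means the primer pairs with the
    3'-terminal bases of the template and points into the variable region.
    """
    return reverse_complement(flank).startswith(primer.upper())


if __name__ == "__main__":
    print(primer_anneals_to_flank(BATF_PRIMER, BATF_3P_FLANK))
    print(primer_anneals_to_flank(IRF_PRIMER, IRF_3P_FLANK))
```

The BATF–JUNB primer is two bases shorter than its flank, so it covers all but the two bases of the flank nearest the degenerate region; the IRF primer spans its flank exactly.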
All binding reactions were done in a 10 μl reaction volume using 100 nM BATF–JUNB hetero-dimers, 150 nM IRF proteins if needed, and 1 μM of dsDNA library in 1× NEB Cutsmart buffer (50 mM potassium acetate; 20 mM Tris–acetate; 10 mM magnesium acetate; 100 μg/ml BSA, pH 7.9 at 25 °C) supplemented with 10% glycerol, and were incubated for 30 min on ice. Electrophoretic mobility shift assays (EMSA) were done using native 9% PAGE prepared as Tris/Glycine (25 mM Tris pH 8.3; 192 mM glycine) mini-gels (Bio-Rad). These gels were first pre-run using 1× Tris/Glycine buffer at 200 V for 30 min, then samples were loaded and gels were run for an additional 40 min at 200 V at 4 °C. After EMSA, the gels were stained with ethidium bromide and visualized using a Bio-Rad gel imager. Each band detected in the EMSA was excised with a disposable sterile toothpick and the DNA in the gel was extracted by incubating for 30 min at 50 °C in 50 μl acrylamide gel extraction buffer [500 mM ammonium acetate; 10 mM magnesium acetate; 1 mM EDTA; 0.1% sodium dodecyl sulfate (SDS)]. Samples in the extraction buffer were purified with the QIAquick Nucleotide Removal Kit (Qiagen) following the manufacturer’s instructions and recovered using MilliQ water (Millipore). Each fraction of DNA was barcoded and amplified using HotStart PCR Master Mix (Lambda Biotech). DNA was denatured at 94 °C for 30 s, annealed at 55 °C for 30 s and extended at 72 °C for 45 s per round for 12–20 rounds with modified Indexed-Illumina primers (PE1-Genetics1/2, PE2.0) (Additional file 7). The PCR product was then purified again using the QIAquick Nucleotide Removal Kit. Multiple samples were pooled, sequenced and analyzed as previously described.
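As a quick arithmetic check on the binding reaction, the concentrations above can be converted to molar amounts per 10 μl reaction. The Python sketch below uses only the stated concentrations and volume; the function and variable names are illustrative.

```python
def pmol_in_reaction(conc_nM: float, volume_ul: float) -> float:
    """Amount of a component in pmol.

    1 nM x 1 ul = 1 fmol, so nM x ul gives fmol; divide by 1000 for pmol.
    """
    return conc_nM * volume_ul / 1000.0


VOLUME_UL = 10.0  # reaction volume stated in the text

dimer_pmol = pmol_in_reaction(100.0, VOLUME_UL)   # BATF-JUNB hetero-dimer
irf_pmol = pmol_in_reaction(150.0, VOLUME_UL)     # IRF4 or IRF8, if added
dna_pmol = pmol_in_reaction(1000.0, VOLUME_UL)    # dsDNA library at 1 uM

if __name__ == "__main__":
    print(f"dimer: {dimer_pmol} pmol, IRF: {irf_pmol} pmol, DNA: {dna_pmol} pmol")
    print(f"DNA : dimer molar ratio = {dna_pmol / dimer_pmol:.0f} : 1")
```

With these numbers the DNA library is in tenfold molar excess over the hetero-dimer, so the library sequences compete for a limiting amount of protein.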
Obtaining the relative affinities of the TF for a collection of sequences S1…Sn (which for convenience we label K1…Kn) requires only measuring the distribution of those sequences in the bound and unbound fractions; none of the concentrations, including that of the free protein, are needed. The ratio of the affinities for any two sequences equals the ratio of their bound-to-unbound ratios.
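The ratio-of-ratios calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' analysis pipeline: read counts in each fraction are taken as proportional to DNA amounts, the function names and example counts are hypothetical, and the conversion to relative energy in units of kT (the negative natural log of the relative affinity) is a common convention that this text does not itself spell out.

```python
import math


def relative_affinities(bound: dict, unbound: dict, reference: str) -> dict:
    """Estimate K_i / K_ref for each sequence from read counts.

    For each sequence i, the bound/unbound count ratio is divided by the
    same ratio for the reference sequence. The free protein concentration
    is shared by all sequences and cancels out of this ratio.
    Sequences with a zero count in either fraction are skipped.
    """
    ref_ratio = bound[reference] / unbound[reference]
    result = {}
    for seq, n_bound in bound.items():
        n_unbound = unbound.get(seq, 0)
        if n_bound > 0 and n_unbound > 0:
            result[seq] = (n_bound / n_unbound) / ref_ratio
    return result


def relative_energies_kT(rel_aff: dict) -> dict:
    """Relative binding energy in kT units: -ln(K_i / K_ref)."""
    return {seq: -math.log(k) for seq, k in rel_aff.items()}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    bound = {"TGACTCA": 400, "TGACGTCA": 200, "TGATTCA": 50}
    unbound = {"TGACTCA": 100, "TGACGTCA": 100, "TGATTCA": 100}
    rel = relative_affinities(bound, unbound, reference="TGACTCA")
    print(rel)
    print(relative_energies_kT(rel))
```

The reference sequence must have non-zero counts in both fractions; in practice a well-covered consensus site is a natural choice.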
YC conducted the experiments, processed the data and contributed to writing the manuscript. ZZ pioneered the technique and helped with experimental design and data analysis. GS supervised all experiments and data analysis and was a major contributor in writing the manuscript. All authors read and approved the final manuscript.
We thank Theresa Murphy from the Kenneth Murphy lab for providing the DNA template for the IRF8 coding sequence.
The authors declare that they have no competing interests.
Availability of data and materials
Raw reads for all the experiments have been deposited in the NCBI short read archive under accession GSE102219.
Consent for publication
Ethics approval and consent to participate
This work was supported by NIH Grant HG000249.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.