BE-FLARE: a fluorescent reporter of base editing activity reveals editing characteristics of APOBEC3A and APOBEC3B
Base Editing is a precise genome editing method that uses a deaminase-Cas9 fusion protein to mutate cytidine to thymidine in target DNA in situ without the generation of a double-strand break. However, the efficient enrichment of genetically modified cells using this technique is limited by the ability to detect such events.
We have developed a Base Editing FLuorescent Activity REporter (BE-FLARE), which allows for the enrichment of cells that have undergone editing of target loci based on a fluorescence shift from BFP to GFP. We used BE-FLARE to evaluate the editing efficiency of the APOBEC3 family members APOBEC3A and APOBEC3B as alternative deaminase domains to the rat APOBEC1 domain used in base editor 3 (BE3). We identified human APOBEC3A and APOBEC3B as highly efficient cytidine deaminases for base editing applications with unique properties.
Using BE-FLARE to report on the efficiency and precision of editing events, we outline workflows for the accelerated generation of genetically engineered cell models and the discovery of alternative base editors.
Keywords: Base editing, Fluorescent reporter, CRISPR/Cas9, APOBEC, Gene editing
Experimental and therapeutic modification of genomic DNA has become a more rapid and efficient process due to the development of CRISPR-Cas-based technologies. Base editing is a recently developed derivative of CRISPR-Cas-mediated genome editing [1, 2]. The third iteration of the Base Editor protein (BE3) is a fusion of three enzymes: rat APOBEC1 cytidine deaminase, Cas9 D10A nickase, and uracil DNA glycosylase inhibitor (UGI). This multi-enzyme complex can introduce high-frequency C to T mutations (or G to A on the complementary strand) through enzymatic deamination of cytidine to uracil at the targeted locus. Replication across the uracil will lead to incorporation of a thymidine at this position due to the misrecognition of uracil as thymidine by DNA polymerases. The base excision repair pathway enzyme, uracil DNA glycosylase, could recognise and remove the uracil; however, the UGI component in BE3 provides local inhibition of such repair. Cas9 nickase allows for guide RNA-mediated targeting, and through nicking of the non-edited strand, engenders repair using the edited strand as a template.
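As an illustration of the chemistry described above, the following minimal Python sketch converts every cytosine inside an assumed activity window of a protospacer to thymine and shows the corresponding G to A change on the complementary strand. The window of positions 4 to 8 (counted from the PAM-distal end), the function names, and the 20-nt toy sequence are assumptions made for illustration and are not taken from this report; real editing is probabilistic rather than all-or-none.

```python
# Toy model of cytidine base editing on a protospacer.
# ASSUMPTIONS (not from the report): editing window = positions 4-8,
# 1-based, counted from the PAM-distal end; every C in the window is edited.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def deaminate_window(protospacer: str, window=(4, 8)) -> str:
    """Return the protospacer with every C inside the window converted to T."""
    start, end = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if start <= pos <= end and base == "C":
            edited.append("T")  # C -> U by deamination, read as T after replication
        else:
            edited.append(base)
    return "".join(edited)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


toy_protospacer = "ATGCCACGTTAGCATCGGAT"  # hypothetical sequence
edited = deaminate_window(toy_protospacer)
print(edited)                      # ATGTTATGTTAGCATCGGAT
print(reverse_complement(edited))  # the same edits appear as G -> A on this strand
```

Cytosines outside the window (positions 13 and 16 in the toy sequence) are left untouched, which is the simplest way to picture why only some cytosines near the target are expected to be edited.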
Introduction or correction of mutations using CRISPR-Cas9 generally depends on DNA double-strand breaks and homology-directed repair (HDR) using an exogenous DNA repair template. This can be a very inefficient process, dependent upon the cell type and cell cycle phase [4, 5, 6]. Furthermore, DNA double-strand breaks generated by Cas9 are resolved in an unpredictable manner, often leading to undesirable outcomes such as insertions and deletions (InDels) and translocations. Base editing has unique advantages in this respect; independence from DNA double-strand break formation and HDR leads to reduced rates of InDel formation and a high efficiency of editing in a broader range of cellular contexts. However, producing genetically engineered cell models using base editing still depends on single-cell cloning and sequencing of genomic DNA to find successfully edited cells; this is often the rate-limiting step in the procedure, and gains in the efficiency of this process have the potential to greatly reduce timelines in cell model generation.
Fluorescent reporters developed to discriminate between CRISPR-Cas9-mediated HDR and non-homologous end joining (NHEJ) events have facilitated the enrichment of cells with desired DNA repair outcomes and led to improved HDR rates in genome engineering [8, 9]. In addition, T2A self-cleaving peptide fusions with fluorescent proteins are commonly used to select enriched pools of transfected cells in gene editing experiments. However, a system for reporting on base editor point mutation activity in mammalian cells, which allows for edited cell enrichment and refinement of base editor architecture, has yet to be demonstrated. We used base editing to introduce a well-documented single amino acid substitution in enhanced Blue Fluorescent Protein (eBFP) that leads to a spectral shift associated with a transition to Green Fluorescent Protein (GFP). By fluorescently marking BE-active cells, we quantitatively assessed the efficiencies of different BE variants incorporating alternative APOBEC enzymes and demonstrated FACS-based enrichment of genetically modified cells, including gene knock-outs and clinically relevant point mutations. We predict that our reporter will expedite cell model generation with base editing.
Validation of a Base Editing FLuorescent Activity REporter (BE-FLARE)
In addition to transient expression of BE-FLARE, we could stably integrate BE-FLARE using ObLiGaRe-mediated integration into the AAVS1 safe-harbour locus, thus allowing for permanent fluorescent demarcation of edited cells. A time-course of digital droplet PCR and microscope imaging of PC9-BE-FLARE cells after editing showed DNA editing of BE-FLARE as early as 18 h and edited cells expressing GFP protein from 48 to 72 h post-transfection (Additional file 1: Figure S2).
Enrichment of edited cells using BE-FLARE
In a reciprocal approach, we used flow cytometry to sort for GFP-positive cells following simultaneous base editing of EGFR and BE-FLARE and then quantified gefitinib resistance in GFP-positive versus mock-sorted cells (total viable population; Fig. 2d). We observed a similar co-enrichment using this approach; GFP-positive cells exhibited enhanced levels of resistance to gefitinib relative to mock-sorted controls, with a ~ 4.5-fold increase in cell growth observed after 5 days of gefitinib treatment (Fig. 2d). There was no observable resistance conferred from a non-targeting gRNA in GFP-sorted cells. Thus, an integrated BE-FLARE can be used to enrich for genetic co-editing events at secondary loci.
Next, we sought to determine how BE-FLARE would compare with marking BE-transfected cells with a fluorescent reporter (TurboRFP) encoded in the BE transcript via a self-cleaving T2A peptide (Fig. 3c). Interestingly, the majority of cells that had undergone base editing of BE-FLARE (i.e. GFP-positive cells) remained RFP-negative (Fig. 3d). Thus, BE3-T2A-RFP expressed below the level detectable by flow cytometry is still functional in cells, implying that using RFP expression for selection would significantly underestimate the number of edited cells. Moreover, the high levels of BE3 expression apparent in the RFP-positive population may not be desirable for many applications due to the possibility of increasing off-target editing.
Activity measurement of APOBEC-Cas9 fusion variants using BE-FLARE
We further analysed the editing pattern of the three APOBEC-BE3 variants by next-generation sequencing of the targeted BFP amplicons in GFP-positive cells isolated by FACS (Fig. 4d). On-target C->T editing within the optimal editing window at codon H66 was higher for rA1 and A3B than for A3A. We observed equal editing frequency of the two cytosines in the CAC H66 codon for rA1 and A3A, but higher stringency for A3B, which seems more capable of discriminating between these two proximal cytosine targets within the optimal editing window. We observed similarly low levels of non-C->T mutations, including C->G and C->A variants, for all three APOBEC variants. Interestingly, A3A produced more bystander C->T and G->A (on the complementary strand) mutations than either the rA1 or A3B version of BE3. Moreover, we observed low-frequency G->A mutations as far as −86 (1.3 ± 0.04% allele frequency) and +76 (1.3 ± 0.1% allele frequency) relative to the protospacer start position, exclusively in the A3A-edited samples (not shown). Notably, one of the bystander mutations produced a premature stop codon (21.8 ± 2.2% allele frequency; Fig. 4d), which likely explains the reduction in GFP expression observed in A3A-BE3 edited cells.
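The possible C->T outcomes at the CAC H66 codon can be enumerated in a short Python sketch, which may help to show why discrimination between the two cytosines matters. It assumes that the GFP-associated residue at position 66 is tyrosine (the well-known His66/Tyr66 difference between BFP and GFP), which the text above does not state explicitly; the function name and the deliberately restricted codon table are illustrative only.

```python
from itertools import product

# Only the four codons reachable from CAC by C->T editing are listed.
# ASSUMPTION: Tyr (Y) at position 66 corresponds to the GFP-like state.
CODON_TABLE = {"CAC": "H", "CAT": "H", "TAC": "Y", "TAT": "Y"}


def c_to_t_outcomes(codon: str) -> dict:
    """Map every combination of C->T edits in the codon to its amino acid."""
    c_positions = [i for i, base in enumerate(codon) if base == "C"]
    outcomes = {}
    # Each cytosine is either left alone (False) or edited (True).
    for mask in product([False, True], repeat=len(c_positions)):
        bases = list(codon)
        for edited, i in zip(mask, c_positions):
            if edited:
                bases[i] = "T"
        new_codon = "".join(bases)
        outcomes[new_codon] = CODON_TABLE[new_codon]
    return outcomes


for new_codon, amino_acid in c_to_t_outcomes("CAC").items():
    print(new_codon, amino_acid)
# CAC H
# CAT H
# TAC Y
# TAT Y
```

Under this assumption, only edits that include the first cytosine change the encoded residue, whereas editing the third-position cytosine alone is silent.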
To better understand the potential mechanisms behind the differing mutational characteristics of each APOBEC-BE3 variant, we analysed the protein expression of the base editors (Fig. 4e). A3A-BE3 had the highest expression of all at the time points tested, perhaps explaining the increased levels of undesired bystander mutations. rA1-BE3 levels were comparable to those of A3B-BE3 but decayed more rapidly over time. Indeed, at 72 h post-transfection, rA1-BE3 was undetectable by Western blotting, whereas A3B-BE3 was still at levels comparable to 24 h post-transfection. Taken together, these results suggest that these APOBEC-Cas9 fusions have drastically different protein expression and/or stability in mammalian cells, which may partially explain base editing characteristics such as efficiency and precision. Notably, we used a codon-optimised version of rat APOBEC1 in BE3 throughout this report, as we found that the codon-optimised version was expressed at much higher levels than the native sequence when assessing RFP expression in a BE3-T2A-TurboRFP system (Additional file 1: Figure S5), which is consistent with recent reports [16, 17].
In conclusion, our deep sequencing analyses are broadly consistent with results generated from editing BE-FLARE. We demonstrate that A3A-BE3 has a broader mutational profile leading to higher bystander mutation rates, which is consistent with the loss of GFP fluorescence in the BE-FLARE system. Thus, BE-FLARE is a valuable tool for measuring the efficiency and precision of novel base editors.
We employed BE-FLARE to evaluate the efficiency and precision of different base editor variants. Replacement of rat APOBEC1 in BE3 with human APOBEC3B resulted in a similar level of activity and specificity to the original, codon-optimised rat APOBEC1. Surprisingly, A3A-BE3 induced more bystander mutations, which we further confirmed at multiple genomic on-target and off-target sites. Although A3A and the A3B C-terminal domain share ~90% similarity in protein sequence, the active site in A3A is open whereas in A3B it is partially occluded by the flexible loop 1 region (Additional file 1: Figure S6) [18, 19, 20]. Whilst these structural differences could explain the increased bystander rates of A3A-BE3, we also observed a significant increase in A3A-BE3 protein expression over time, which suggests there could be a fine balance between protein abundance/stability and precision of base editors. This is supported by the recent finding that codon optimisation of BE3 constructs significantly increases activity [16, 17], suggesting that protein expression from the original BE3 constructs is a limiting factor. Whilst undesirable for precision applications, the increased bystander editing frequency of A3A-BE3 may prove beneficial for targeted mutagenesis approaches similar to the CRISPR-X system [21, 22]. A recent study demonstrated that several point mutations in A3A can reduce bystander mutation rates of A3A-BE3, showing that this highly active cytidine deaminase can be rationally refined for gene editing. Our findings imply that A3B may also prove to be an excellent starting point to develop more precise base editors with increased editing efficiency, whilst the reduction of immunogenic peptides compared to rA1 could be beneficial in therapeutic settings.
We have developed a capability to enrich genetically edited cells through selection based upon a fluorescent reporter of base editing activity. Transient expression and selection of BE-FLARE-positive cells allowed significant enrichment of base editing at secondary sites. This methodology is broadly applicable and can help when generating a modified cell line by reducing the number of clones screened to identify the desired genotype, especially when no phenotypic selection is possible or where the desired mutation has deleterious effects on cell fitness. The importance of this benefit is highlighted by the low level of base editing of EGFR observed in mock-selected pools compared to the significant increase after BE-FLARE enrichment. This low level of editing is likely a result of the high EGFR copy number in PC9 cells, which are dependent upon mutant EGFR signalling for survival. As gene copy number alterations are common in cancer cell lines, enrichment before generation of single cell clones offers an invaluable tool to improve the success of cell model generation. At certain sites, BE-FLARE enrichment generated very high levels of editing in bulk pools, which in some cases may avoid the need to use single-cell clones entirely.
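The effect of enrichment on screening effort can be made concrete with a small Python sketch. It estimates how many single-cell clones would need to be screened to recover at least one clone of the desired genotype with a chosen probability, assuming each clone independently carries the genotype with the same frequency. The function name, the 95% confidence level, and the example frequencies are hypothetical and are not measurements from this study.

```python
import math


def clones_to_screen(edited_fraction: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one desired clone among n) >= confidence.

    Assumes independent clones, each with probability `edited_fraction`
    of carrying the desired genotype (a simplifying assumption).
    """
    if not 0 < edited_fraction <= 1:
        raise ValueError("edited_fraction must be in (0, 1]")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be in (0, 1)")
    if edited_fraction == 1:
        return 1
    # P(no desired clone in n) = (1 - f)^n <= 1 - confidence
    n = math.log(1 - confidence) / math.log(1 - edited_fraction)
    return math.ceil(n)


# Hypothetical frequencies before and after enrichment
print(clones_to_screen(0.05))  # 59
print(clones_to_screen(0.5))   # 5
```

In this simplified model, a tenfold rise in the fraction of correctly edited cells reduces the number of clones to screen by roughly an order of magnitude.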
Alternative BE reporter systems have been recently described; a fluorescent Stop-GFP reporter, where a stop codon is mutated in order to activate GFP expression, was used to monitor activity of Cas9 fused to activation-induced cytosine deaminase [21, 23]. This reporter lacks a detectable signal before editing, making it difficult to monitor the efficiency of reporter delivery or to infer ratios of edited versus unedited reporter. Another study described a reporter which required a dual guide approach to generate two proximal uracil-ssDNA nicks, repair of which can lead to in-frame InDel events and restoration of the mCherry reading frame. The authors achieved enrichment of edited cells by sorting mCherry-positive cells, but how sensitive InDel formation is as a measure of base editor activity is unclear. In contrast, BE-FLARE provides a direct read-out of desired point mutation events, which are the prevailing product of base editing. Finally, a similar BFP to GFP strategy has been reported to detect base editing events in plant cells; our complementary data in human cells represent the first instance of its use in selection for co-editing events. A current limitation of our reporter strategy is that it provides a relative measure of BE/cytidine deaminase activity rather than an absolute measure, since a small number of editing events may result in InDels or perfect DNA repair, neither of which gives rise to GFP-expressing cells. We envisage that BE-FLARE could be further refined to contain a single cytosine in the histidine 66 codon to allow for easy reversal of editing back to the WT BFP sequence using the recently published A->G adenosine base editor (ABE). This would allow for tracking of cells that have undergone a transition from BFP to GFP and then back to BFP, facilitating genetic rescue experiments on endogenous genes.
Using an alternative approach, we marked transfected cells with a co-expressed fluorescent protein. Whilst this system is suitable for transient transfection, the large size of the final expression cassettes at nearly 8.5 kb precludes such delivery by lentivirus, where cargo sizes are limited. Importantly, BE-FLARE directly reports on base editor activity rather than simply the expression of BE3, providing a more functional read-out. Separate delivery of BE-FLARE allows for maximum flexibility and applicability in multiple cell systems and with various base editor versions. Finally, cell lines stably expressing the BE-FLARE allow for tracking of edited cells over time to monitor phenotype, which is not possible with transient expression of a fluorescent protein. Indeed, as GFP fluorescence is amenable to detection by fluorescence microscopy, BE-FLARE can be applied to detect base editing in high-throughput functional genetic screens. This reporter may also be employed in therapeutic genome editing, where it is important to select for rare editing events in primary cells without introducing a permanent genetic marker.
In conclusion, we present BE-FLARE as a rapidly implementable system for tracking and selecting base edited cells and refining the next generation of base editors.
HEK293 and PC9 cells (both from ATCC) were maintained at 5% CO2, 95% air in RPMI, 10% FCS, 1× GlutaMAX (ThermoFisher). Transfections were performed with FuGENE HD (Promega) at a 3:1 ratio of transfection reagent to DNA according to the manufacturer’s instructions. Cell lines were STR profiled and verified as mycoplasma-free.
Cloning and plasmids
The BE3 expression cassette was synthesised (ThermoFisher) and cloned into pcDNA3.1(+). We introduced a cassette into the Mlu site containing an AarI guide cloning region with ccdB for selection, and the human U6 promoter driving gRNA expression. gRNA sequences were cloned into the AarI site using complementary primer pairs, which were annealed, phosphorylated, and ligated into the linearised vector. Primers can be found in Additional file 1: Table S1. For the BFP reporter construct, a gBlock encoding eBFP was synthesised (IDT) and introduced by Gibson assembly (NEB) into an expression vector under the human EF-1 alpha promoter. The vector contains sequences to allow ObLiGaRe-mediated integration into the human AAVS1 ‘safe harbour’ locus. All sequences of the synthesised cassettes and guide RNAs are listed in Additional file 1: Supplementary Methods. VEGFA, EMX1 and non-targeting guide RNAs are published [15, 28, 29].
Generation of stable BE-FLARE cell lines
HEK293 and PC9 cells were transfected with BE-FLARE plasmid and a construct encoding zinc-fingers targeting the AAVS1 safe-harbour locus, essentially as described, and subsequently selected for 3 days with puromycin (1 μg/ml).
FACS was carried out on a FACSJazz (BD Biosciences), and flow cytometry analysis was carried out on a Fortessa (BD Biosciences). Briefly, cells were transfected with the indicated constructs and, 3 days later, harvested by trypsinisation for flow cytometry analysis or FACS.
Forty-eight hours after transfection with the indicated BE3 variant (1 μg of plasmid per well of a 12-well plate with FuGENE HD; Promega), genomic DNA was extracted from the resultant pool of HEK293 cells using the DNeasy Blood & Tissue Kit (Qiagen). PCR1 amplicons were generated using primers containing adapter sequences as stated in Additional file 1: Table S2. PCR1 primers for human (HEK293 cell) EMX1, VEGFA, and VEGFA off-targets are published. Genomic DNA was amplified using the predetermined minimal PCR cycle number required, which ranged between 22 and 25 cycles. Indexing primers were added in a second PCR step with a further 10 PCR cycles using 1 ng of purified PCR product from PCR1. For all PCR reactions, amplicons were cleaned up using MAGBIO magnetic SPRI beads and amplicon size was validated using the QIAxcel (Qiagen). Libraries were quantified using the KapaQuant qPCR kit (KAPA Biosystems), pooled and sequenced on a MiSeq (Illumina).
Base editing efficiencies were estimated from Sanger sequence chromatograms using EditR, or by analysis of NGS data. For amplicon sequencing data analyses, Fast Length Adjustment of Short reads (FLASH v1.2.11) was used to merge paired reads. BWA-MEM was used to align to the human genome (hg19) or the BFP coding sequence. Samtools was used to generate sorted, indexed BAM files. Samtools was used to generate data for variant calling with the following options: minimum read depth 50, minimum quality 25, minimum allele frequency 0.005, maximum mismatch 100, and trim 20.
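To illustrate how the depth and allele-frequency thresholds above act on a single position, the following Python sketch filters per-base read counts. It is a simplified stand-in, not the pipeline used in this study: the function name, the input format (a dictionary of base counts at one reference position), and the example counts are assumptions, and the quality, mismatch, and trim options are not modelled.

```python
def call_variants(ref_base, counts, min_depth=50, min_af=0.005):
    """Return (substitution, allele_frequency) pairs that pass both thresholds.

    ref_base : reference base at this position, e.g. "C"
    counts   : hypothetical per-base read counts, e.g. {"C": 780, "T": 218}
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return []  # too few reads to call anything at this position
    calls = []
    for base, n in sorted(counts.items()):
        if base == ref_base or n == 0:
            continue
        allele_frequency = n / depth
        if allele_frequency >= min_af:
            calls.append((f"{ref_base}>{base}", allele_frequency))
    return calls


# Hypothetical counts: C>T passes, C>G (0.2%) falls below the 0.5% cut-off
print(call_variants("C", {"C": 780, "T": 218, "G": 2}))
# Position with only 40 reads: nothing is reported
print(call_variants("C", {"C": 30, "T": 10}))
```

With a 0.5% allele-frequency cut-off, low-frequency events around 1% remain reportable, whereas rarer substitutions are treated as noise.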
Whole cell lysates were generated using RIPA buffer (ThermoFisher Scientific), and Western blotting was performed using standard methods, with secondary antibodies conjugated to horseradish peroxidase (GE Healthcare). Cas9 (#14697; RRID: AB_2750916) and GAPDH (#2118; RRID: AB_561053) antibodies were from Cell Signaling Technology.
Digital droplet PCR (ddPCR)
Base editing over time was estimated by extraction of genomic DNA with the DNeasy Blood & Tissue Kit (Qiagen) followed by ddPCR with ddPCR Supermix for Probes (No dUTP) (Bio-Rad) according to the manufacturer’s instructions. Probes were labelled with FAM and are listed in Additional file 1: Supplementary Methods.
Experimental design and statistics
The exact value of sample size (n), statistical tests used, and the number of independent experiments performed are given in the figure legends. Unless otherwise stated, error bars represent standard deviation and an unpaired Student’s t test was used to assess statistical significance (P < 0.05).
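For readers who want to see the arithmetic behind these summary statistics, the Python sketch below computes the sample standard deviation of two groups and the unpaired Student's t statistic with pooled variance. The function name and the replicate values are hypothetical; converting the statistic to a P value requires the t distribution, which is left to a statistics package.

```python
import math
import statistics


def students_t(group_a, group_b):
    """Unpaired Student's t statistic (equal variances assumed) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's variance by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical replicate measurements (arbitrary units)
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
print(statistics.stdev(control))  # sample standard deviation, as used for error bars
t_value, df = students_t(control, treated)
print(t_value, df)
```

The resulting t value is compared against the t distribution with the stated degrees of freedom to decide whether P falls below 0.05.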
We thank the Discovery Biology IMED Biotech Unit, AstraZeneca, for helpful discussions. We also thank Daniel O’Neill for providing valuable advice regarding the editing of EGFR and the NGS team for their assistance with DNA sequencing. MC is a fellow of the AstraZeneca postdoc programme.
This work was funded by AstraZeneca plc.
Availability of data and materials
All data generated or analysed during this study are included in this published article, its supplementary information files, and publicly available repositories. Sequencing data are available from the NCBI Sequence Read Archive database, accession: SRP153020. Raw data relating to figures and supplemental figures can be found in Additional file 2.
MAC and BJMT designed the study. MAC carried out the experiments. MF provided bioinformatics support for NGS data analysis. SL, LSP and GC provided reagents. EC, JDW, MM and BJMT provided conceptual advice. MAC and BJMT wrote the manuscript. All authors read and approved the final manuscript.
The authors declare no competing interests.
- 2. Nishida K, Arazoe T, Yachie N, Banno S, Kakimoto M, Tabata M, et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science. 2016;353. https://doi.org/10.1126/science.aaf8729.
- 13. Godin-Heymann N, Ulkus L, Brannigan BW, McDermott U, Lamb J, Maheswaran S, et al. The T790M “gatekeeper” mutation in EGFR mediates resistance to low concentrations of an irreversible EGFR inhibitor. Mol Cancer Ther. 2008;7:874–9. https://doi.org/10.1158/1535-7163.MCT-07-2387.
- 31. Kluesner MG, Nedveck DA, Lahr WS, Garbe JR, Abrahante JE, Webber BR, et al. EditR: a method to quantify base editing from Sanger sequencing. CRISPR J. 2018;1(3). https://doi.org/10.1089/crispr.2018.0014.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.