Enhanced mammalian genome editing by new Cas12a orthologs with optimized crRNA scaffolds
CRISPR-Cas12a/Cpf1, a single RNA-guided endonuclease system, provides a promising tool for genome engineering. However, only three Cas12a orthologs have been employed for mammalian genome editing, and both editing efficiency and targeting coverage still require improvement. Here, we harness six novel Cas12a orthologs for genome editing in human and mouse cells, some of which utilize simple protospacer adjacent motifs (PAMs) that markedly increase the targeting range in these genomes. Moreover, we identify optimized CRISPR RNA (crRNA) scaffolds that can increase the genome editing efficiency of Cas12a.
Keywords: CRISPR-Cas12a/Cpf1, crRNA, Genome editing, Human cells, Mouse cells
CRISPR: Clustered regularly interspaced short palindromic repeats
Indel: Insertion or deletion
NLS: Nuclear localization signal
PAM: Protospacer adjacent motif
T7EI: T7 endonuclease I
Clustered regularly interspaced short palindromic repeats (CRISPR)-Cas12a/Cpf1 is a type V-A CRISPR-Cas (CRISPR-associated proteins) system that has recently been harnessed for genome editing. Several unique features distinguish Cas12a from Cas9, providing a substantial expansion of CRISPR-based genome-editing tools. First, Cas12a is a single crRNA-guided endonuclease, while Cas9 is guided by a dual-RNA system consisting of a crRNA and a trans-activating crRNA (tracrRNA). Second, Cas12a recognizes a 5′ T-rich protospacer adjacent motif (PAM), different from the 3′ G-rich PAM utilized by Cas9 [3, 4]. Third, after cleavage of double-stranded DNA (dsDNA), Cas12a generates staggered ends distal to the PAM site, whereas Cas9 introduces blunt ends within the PAM-proximal target site. Moreover, the RuvC and Nuc domains of Cas12a are responsible for target DNA cleavage, whereas Cas9 uses the RuvC and HNH endonuclease domains to cleave the target DNA. While these diverse properties of the CRISPR-Cas12a system provide potential for the development of versatile tools for genome engineering [1, 7, 8, 9, 10, 11], challenges remain, including the small number of currently identified orthologs, limited genomic targeting coverage, and relatively low editing efficiency [1, 12, 13, 14, 15]. To address these limitations, we aimed to identify novel Cas12a nucleases with simpler PAM requirements, which can increase the targeting range, and to engineer the crRNA scaffold to achieve higher genome editing efficiency.
Results and discussion
Next, we explored the capability of these Cas12a orthologs to cleave target genomic sequences in mammalian cells. The 12 synthesized Cas12a genes fused with 2 nuclear localization signals (NLSs) at each end were cloned into mammalian expression vectors for Cas12a expression in human and mouse cells (Additional file 2: Figure S3a and Additional file 5: Supplementary Sequences). After transfection, immunofluorescence staining showed clear nuclear localization of the Cas12a proteins in mammalian cells (Additional file 2: Figure S3a). Then, we co-transfected human embryonic kidney 293FT cells or mouse embryonic stem cells (ESCs) with individual Cas12a orthologs and crRNAs (scaffold 1) to target endogenous loci containing 5′ T-rich PAMs. The T7 endonuclease I (T7EI) assay showed that 6 Cas12a nucleases (ArCas12a, BsCas12a, HkCas12a, LpCas12a, PrCas12a, and PxCas12a) could all mediate genome editing in both human and mouse genomes with the 5′-TTTN PAM (Fig. 1b and Additional file 2: Figure S3b) or the 5′-TTN PAM (Additional file 2: Figure S3c). Sanger sequencing further confirmed the capacity of these 6 Cas12a nucleases to introduce insertions or deletions (indels) at target sites in the mammalian genomes (Additional file 2: Figure S3d-f). We next focused on exploring the in vivo PAM requirement of HkCas12a, which had the simplest PAM (5′-YYN) in vitro (Additional file 2: Figure S2d). By targeting the human AAVS1, CD34, and RNF2 loci in 293FT cells (Additional file 4: Table S4), we showed that HkCas12a induced indels at target sites with the 5′-YTN and 5′-TYYN PAMs (Additional file 2: Figure S4a, b). Then, we compared the genomic coverage of HkCas12a with that of the previously reported AsCas12a by targeting endogenous loci containing the requisite PAMs in mammalian genomes. Notably, HkCas12a showed broader genomic coverage than AsCas12a (Fig. 1c and Additional file 2: Figure S4c).
These data demonstrate that the new Cas12a nucleases support mammalian genome editing, with their PAMs determined as 5′-TTN, 5′-YTN, or 5′-TYYN in vivo, which markedly increases the targeting range of Cas12a nucleases in mammalian genomes.
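To make the effect of a relaxed PAM on targeting range concrete, the following is a minimal sketch (not the authors' pipeline) that scans both strands of a sequence for 5′ PAMs written with IUPAC degenerate codes (Y = C or T, N = any base). The function names, the toy sequence, and the 23-nt protospacer length are assumptions chosen for illustration only.

```python
import re

# IUPAC codes needed for the PAMs discussed here (Y = C or T, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, pam, spacer_len=23):
    """Yield (strand, pam_start, pam_seq, protospacer) for each 5' PAM match.

    pam_start is a 0-based index on the strand being scanned.
    spacer_len=23 is an assumed, illustrative protospacer length.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A zero-width lookahead reports overlapping PAM matches as well.
    pattern = re.compile(f"(?=({pam_regex})([ACGT]{{{spacer_len}}}))")
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            yield strand, match.start(), match.group(1), match.group(2)


if __name__ == "__main__":
    # Toy sequence, not a real genomic locus.
    toy = "GGTTTAACGTACGTACGTACGTACGTACGGG"
    for pam in ("TTTN", "TTN", "YTN"):
        print(pam, len(list(find_sites(toy, pam))))
```

Counting sites per PAM over a region of interest gives a rough, sequence-level view of coverage; it says nothing about whether a given site is actually edited in cells.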
We further characterized the targeting efficiency and specificity of these newly identified Cas12a nucleases. First, we directly compared the relative targeting activities of Cas12a and SpCas9, whose efficiency is considered the current gold standard for genome editing. By performing T7EI analyses of targeted indels at endogenous genomic sites, we found that the average targeting efficiencies of these Cas12a proteins (ArCas12a, BsCas12a, HkCas12a, and PrCas12a) were lower than that of SpCas9, although these Cas12a proteins achieved higher targeting efficiencies when directed by crRNA 4n96 than by crRNA 1 (Additional file 4: Table S7). Meanwhile, to address the off-target risks of Cas12a, we performed off-target predictions using Cas-OFFinder followed by targeted deep sequencing. The results showed that both Cas12a (BsCas12a and PrCas12a) and SpCas9 exhibited a low incidence of off-target mutations at the endogenous DNMT1 site 1 in targeted human 293FT cells (Additional file 4: Table S8). Moreover, genome-wide off-target analysis by whole genome sequencing (WGS) also showed a low incidence of off-target mutations for both Cas12a and SpCas9 (Additional file 4: Table S9). Together, these data indicate minimal off-target risks for Cas12a, consistent with previous reports [13, 14].
In this work, we report the identification of six new Cas12a nucleases for genome editing in mammalian cells, including one Cas12a ortholog (HkCas12a) that recognizes the more flexible 5′-YTN and 5′-TYYN PAMs and can therefore provide broader genome coverage. However, the precise PAM sequences of these orthologs still need to be determined using high-throughput approaches. The non-canonical PAM recognition by HkCas12a may be due to variation at residue L642, which is equivalent to K592 of LbCas12a, a residue responsible for non-canonical PAM recognition (Additional file 2: Figure S8) [20, 21]. As previous studies indicated, crRNA scaffolds can affect or even enhance the targeting activities of CRISPR-Cas systems [1, 12, 13, 18]. By engineering nucleotide substitutions in the loop region, we identified a crRNA scaffold that markedly improves Cas12a-mediated genome editing efficiency. Crystal structures of the Cas12a-crRNA-DNA complex have shown that nucleotides in the loop region of the crRNA scaffold interact with Cas12a residues [6, 22], indicating that nucleotide substitutions in this region could affect the activity of the Cas12a-crRNA complex. Further structural characterization of Cas12a-crRNA-DNA complexes with different crRNA scaffolds will help elucidate the exact mechanism of this improvement. Collectively, our findings expand the CRISPR-Cas12a genome editing toolbox and may enhance its application in mammalian genome engineering and human gene therapy.
Identification of new CRISPR-Cas12a loci
The PSI-BLAST program was applied to identify Cas12a homologs in the NCBI non-redundant protein sequence database using the AsCas12a and LbCas12a protein sequences as queries. Cas12a loci not yet harnessed for mammalian genome editing were chosen as candidates for analysis. CRISPR repeats were identified using CRISPRFinder.
crRNA scaffold library construction
Paired degenerate primers were synthesized and annealed to form a duplex with 5′ overhangs (Additional file 4: Table S5). The duplexes were then cloned into a U6 promoter-driven expression vector (Additional file 5: Supplementary Sequences). Scaffold variants were randomly picked from culture plates and then sequenced.
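As an illustration of how a degenerate primer encodes a library of scaffold variants, the sketch below expands an oligo containing IUPAC degenerate positions into every concrete sequence it represents. The oligo shown is a hypothetical placeholder, not a primer from Table S5, and the function name is an assumption made for this example.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligo."""
    choices = [IUPAC_BASES[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical stem-loop fragment with a fully degenerate 4-nt loop.
    hypothetical_oligo = "TCTACNNNNGTAGA"
    variants = expand_degenerate(hypothetical_oligo)
    print(len(variants), variants[:3])
```

The library size is the product of the number of bases allowed at each degenerate position, which indicates how many colonies would need to be sequenced to sample a given fraction of the variants.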
Cell culture, transfection, and fluorescence-activated cell sorting
The human embryonic kidney cell line 293FT and the human cervical cancer cell line HeLa were cultured in Dulbecco’s modified Eagle’s medium (DMEM, Gibco) supplemented with 10% fetal bovine serum (FBS, Gibco) and 1% Antibiotic-Antimycotic (Gibco). The mouse embryonic stem (mES) cell line was maintained in N2B27 medium plus 2i (Stemgent) and mLIF (Millipore). The N2B27 medium consisted of DMEM/F12 (Gibco) and Neurobasal (Gibco) at a ratio of 1:1 and was supplemented with 1% N-2 supplement (Gibco), 0.5% B-27 supplement (Gibco), 20 ng/ml BSA (Sigma), 10 μg/ml insulin (Roche), 1% GlutaMAX (Gibco), 5% knockout serum replacement (KOSR, Gibco), 0.1% β-mercaptoethanol (Gibco), and 1% Antibiotic-Antimycotic (Gibco). 293FT cells were transfected using Lipofectamine LTX (Invitrogen) following the manufacturer’s recommended protocol. mES cells were transfected by electroporation using the Neon™ transfection system (Invitrogen) following the manufacturer’s recommended protocol. For each well of a 24-well plate, a total of 750 ng of plasmids (Cas12a-2AeGFP:crRNA = 2:1) was used. Then, 48 h after transfection, GFP-positive cells were sorted using the MoFlo XDP (Beckman Coulter).
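The per-well plasmid amounts implied by the 2:1 ratio can be worked out as follows. This is a minimal sketch that assumes the ratio is by mass (the text does not state whether it is by mass or by moles); the function name is hypothetical.

```python
def split_by_ratio(total_ng, ratio):
    """Split a total plasmid mass (ng) into per-plasmid amounts by mass ratio."""
    parts = sum(ratio)
    return [total_ng * share / parts for share in ratio]


if __name__ == "__main__":
    # 750 ng total per well, Cas12a-2AeGFP:crRNA = 2:1 (assumed to be a mass ratio).
    cas12a_ng, crrna_ng = split_by_ratio(750, (2, 1))
    print(f"Cas12a plasmid: {cas12a_ng:.0f} ng, crRNA plasmid: {crrna_ng:.0f} ng")
```

Under the mass-ratio assumption this gives 500 ng of the Cas12a plasmid and 250 ng of the crRNA plasmid per well.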
T7 endonuclease I assay for genome modification
Cells were collected 48 h post-transfection for genomic DNA extraction. GFP-positive cells sorted by FACS were lysed directly using Buffer L (Bimake). The genomic region flanking the Cas12a target site of each gene was PCR-amplified (Additional file 4: Table S6), and products were purified using DNA Clean & Concentrator (ZYMO Research) following the manufacturer’s protocol. A total of ~200 ng of purified PCR amplicons was mixed with 1 μl NEBuffer 2 (NEB) and diluted in ddH2O to 10 μl, then subjected to a re-annealing process to form heteroduplexes according to our previously reported procedure. After re-annealing, the products were treated with T7EI (NEB) following the manufacturer’s recommended protocol and analyzed on 2.5% agarose gels (Takara). Indel frequencies were calculated from band intensities based on a previously reported method.
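The band-intensity calculation is not spelled out in the text. The sketch below assumes the formula commonly used for T7EI and similar mismatch-cleavage assays, indel (%) = 100 × (1 − √(1 − f_cut)), where f_cut is the cleaved fraction of total band intensity; whether the cited method uses exactly this formula is an assumption, and the function name and intensity values are illustrative.

```python
import math


def t7ei_indel_percent(uncut, cleaved):
    """Estimate indel frequency (%) from gel band intensities.

    uncut   -- intensity of the undigested PCR product band
    cleaved -- iterable of intensities of the cleavage product bands
    Assumes indel % = 100 * (1 - sqrt(1 - f_cut)), a commonly used
    estimate for mismatch-cleavage assays (not confirmed by the source).
    """
    cleaved_total = sum(cleaved)
    total = uncut + cleaved_total
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cleaved_total / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Illustrative intensities in arbitrary densitometry units.
    print(round(t7ei_indel_percent(64, [18, 18]), 1))
```

The square root reflects that, after random re-annealing, only duplexes pairing a mutant strand with a non-identical strand are cleaved, so the cleaved fraction overstates the share of modified alleles.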
GFP disruption assay
Human 293FT.eGFP cells harboring a single-copy, integrated AAVS1-eGFP gene were generated in our lab. These cells were transfected using Lipofectamine LTX (Invitrogen) with a Cas12a expression plasmid and a crRNA expression plasmid, or with a Cas12a expression plasmid and a U6 promoter-driven empty plasmid as a negative control. Three days post-transfection, cells were analyzed on the MoFlo XDP (Beckman Coulter). For each sample, transfections and flow cytometry measurements were performed in triplicate.
We thank Shi-Wen Li, Xi-Li Zhu, Qing Meng and Xia Yang for their help with fluorescence-activated cell sorting.
This study was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA16030400), the National Natural Science Foundation of China (Grant No. 31621004, 31422038), the National Key Research and Development Program (Grant No. 2017YFA0103803), the National Basic Research Program of China (Grant No. 2014CB964800), and CAS Key Projects (Grant No. QYZDY-SSW-SMC022, QYZDB-SSW-SMC002).
Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional files. The raw sequence data reported in this paper have been submitted to the NCBI BioSample (https://www.ncbi.nlm.nih.gov/biosample) under accession number PRJNA511656.
WL and QZ conceived this project, supervised the experiments, and wrote the paper. FT performed the experiments, analyzed the data, and wrote the paper. JL and TC performed the experiments and analyzed the data. KX, LG, QG, and GF performed the experiments. CC and DH analyzed the data. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
A patent application has been filed relating to this work. The authors declare that they have no competing interests, and they plan to deposit the reagents in Addgene to freely share with the academic community.
17. Zetsche B, Strecker J, Abudayyeh OO, Gootenberg JS, Scott DA, Zhang F. A survey of genome editing activity for 16 Cpf1 orthologs. bioRxiv. 2017. https://doi.org/10.1101/134015.
18. Li B, Zhao W, Luo X, Zhang X, Li C, Zeng C, Dong Y. Engineering CRISPR-Cpf1 crRNAs and mRNAs to maximize genome editing efficiency. Nat Biomed Eng. 2017;1:0066.
26. Teng F, Li J, Cui T, Xu K, Guo L, Gao Q, Feng G, Chen C, Han D, Zhou Q, Li W. Enhanced mammalian genome editing by new Cas12a orthologs with optimized crRNA scaffolds. BioSample. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA511656/.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.