Web-based design and analysis tools for CRISPR base editing
Because of its simplicity and high efficiency, the CRISPR-Cas system has been widely used as a genome editing tool. Recently, CRISPR base editors, which consist of deactivated Cas9 (dCas9) or Cas9 nickase (nCas9) linked to a cytidine or an adenine deaminase, have been developed. Base editing tools will be very useful for gene correction because they can produce highly specific DNA substitutions without the introduction of any donor DNA, but dedicated web-based tools to facilitate their use have not yet been developed.
We present two web tools for base editors, named BE-Designer and BE-Analyzer. BE-Designer provides all possible base editor target sequences in a given input DNA sequence with useful information including potential off-target sites. BE-Analyzer, a tool for assessing base editing outcomes from next generation sequencing (NGS) data, provides information about mutations in a table and interactive graphs. Furthermore, because the tool runs client-side, large amounts of targeted deep sequencing data (< 1 GB) do not need to be uploaded to a server, substantially reducing running time and increasing data security. BE-Designer and BE-Analyzer can be freely accessed at http://www.rgenome.net/be-designer/ and http://www.rgenome.net/be-analyzer/, respectively.
We developed two web tools for CRISPR base editors: one to design target sequences (BE-Designer) and one to analyze NGS data from experimental results (BE-Analyzer).
Keywords: CRISPR, Base editing, Web-based tool, Genome editing, NGS analysis
ABEs: Adenine base editors
BEs: Cytosine base editors
CRISPR-Cas: Clustered regularly interspaced short palindromic repeats and CRISPR associated
DSBs: DNA double-stranded breaks
NGS: Next generation sequencing
NHEJ: Non-homologous end joining
TadA: tRNA adenine deaminase
CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR associated), an immune system in bacteria and archaea that targets nucleic acids of viruses and plasmids, is now widely used as a genome editing tool because of its convenience and high efficiency [1, 2, 3, 4, 5]. The most popular endonuclease, type II CRISPR-Cas9, makes DNA double-stranded breaks (DSBs) at a desired site with the help of its single-guide RNA (sgRNA) [6, 7, 8]. The DSBs provoke the cell's own repair systems: error-prone non-homologous end joining (NHEJ) and error-free homology-directed repair (HDR), resulting in gene knock-out and knock-in (or gene correction), respectively. However, it is relatively difficult to induce gene corrections such as single-nucleotide substitutions because HDR occurs rarely in mammalian cells compared to NHEJ. Furthermore, Cas9 can frequently induce DSBs at undesired sites with sequences similar to that of the sgRNA [10, 11].
Recently, CRISPR-mediated base editing tools have been developed. These tools enable the direct conversion of one nucleotide to another without producing DSBs in the target sequence and without the introduction of donor DNA templates. The initial base editors (named BEs), composed of dCas9 or nCas9 linked to a cytidine deaminase such as APOBEC1 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1) or AID (activation-induced deaminase), convert C to T. Later, adenine base editors (ABEs) were constructed by using tRNA adenine deaminase (TadA), evolved to enable the direct conversion of A to G in DNA. Because of their ability to make highly specific DNA substitutions, these base editing tools will be very useful for gene correction [17, 18, 19, 20, 21, 22], but to the best of our knowledge, a user-friendly and freely available web-based tool for their design and analysis has not yet been developed.
BE-Designer is an sgRNA design tool for CRISPR base editors. BE-Designer rapidly provides a list of all possible sgRNA sequences from a given input DNA sequence along with useful information: possible editable sequences in a target window, relative target positions, GC content, and potential off-target sites. The BE-Designer interface was developed with Django as the backend framework.
Input panels in BE-Designer
Selection of sgRNAs
Within a given DNA sequence, BE-Designer finds all possible target sites based on input parameters; in the base editing window, target nucleotides are highlighted in red, and their relative position and GC content are indicated. BE-Designer then invokes Cas-OFFinder to search throughout the entire genome of interest for possible off-target sequences that differ by up to 2 nucleotides from the on-target sequences (Additional file 1: Figure S1).
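The target-site search described above can be illustrated with a minimal Python sketch. It is not the BE-Designer implementation: the function names, the 20-nt guide length, the NGG PAM, and the forward-strand-only search are assumptions made for illustration, and the off-target search performed by Cas-OFFinder is not reproduced. The window is expressed as distances upstream of the PAM, with 1 being the base adjacent to the PAM.

```python
import re

def gc_content(seq):
    """Return the GC percentage (0-100) of a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def find_targets(dna, guide_len=20, pam_regex="[ACGT]GG",
                 window=(13, 17), target_base="C"):
    """List forward-strand targets (guide followed by a PAM) that carry
    at least one editable base inside the base editing window.

    All parameter defaults are illustrative assumptions.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile("(?=([ACGT]{%d})(%s))" % (guide_len, pam_regex))
    near, far = window
    # A base at distance d from the PAM sits at 0-based index guide_len - d.
    win_start, win_end = guide_len - far, guide_len - near + 1
    targets = []
    for match in pattern.finditer(dna):
        guide, pam = match.group(1), match.group(2)
        editable = [i + 1 for i in range(win_start, win_end)
                    if guide[i] == target_base]
        if editable:
            targets.append({
                "start": match.start() + 1,       # 1-based position in the input
                "guide": guide,
                "pam": pam,
                "editable_positions": editable,   # 1-based, from the 5' end of the guide
                "gc_percent": round(gc_content(guide), 1),
            })
    return targets
```

For example, `find_targets("AAACCC" + "A" * 14 + "TGG")` reports one target whose editable cytosines lie at guide positions 4 to 6. A fuller version would also scan the reverse complement and accept other PAMs.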
Due to its high sensitivity and precision, targeted deep sequencing is the best method for assessing the results of base editing. BE-Analyzer accepts targeted deep-sequencing data and analyzes them to calculate base conversion ratios. In addition to the interactive table and graphs showing the results, BE-Analyzer also provides a full list of all query sequences aligned to a given wild-type (WT) sequence, so that users can confirm mutation patterns manually. BE-Analyzer runs entirely in the client-side web browser, so there is no need to upload very large NGS datasets (< 1 GB) to a server, which removes a time-consuming step in genome editing analysis. The BE-Analyzer interface was also developed with Django as the backend framework. The core algorithm of BE-Analyzer was written in C++ and then compiled to WebAssembly with Emscripten (http://kripken.github.io/emscripten-site/).
Input panels in BE-Analyzer
To analyze query sequences in NGS data, BE-Analyzer requires basic information: a full WT sequence for reference, the type of base editor, the desired base editing window, and the target DNA sequence (Fig. 2b). Previous studies have reported the optimal target window for each base editor. For example, BE3 usually induces base conversion in a region ranging from 13 to 17 nucleotides (nt) upstream of the PAM, and Target-AID is most efficient within a region 15 to 19 nt upstream of the PAM. By default, BE-Analyzer supplies window values taken from previous studies, but users can freely change them manually. It has also been reported that base editors can introduce substitutions outside of the DNA target sequences at a low frequency. Therefore, BE-Analyzer allows additional flanking windows on each side of the target to be included in the analysis through a dedicated parameter.
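As a rough illustration of how these inputs translate into coordinates on the reference, the Python sketch below converts a window given as distances upstream of the PAM into positions in the WT sequence and extends the target by a flanking length. The function names are hypothetical, and the sketch assumes that the target is supplied without its PAM and lies on the forward strand of the reference; only the two default windows quoted above are included.

```python
# Default windows quoted in the text, as distances (nt) upstream of the PAM.
DEFAULT_WINDOWS = {"BE3": (13, 17), "Target-AID": (15, 19)}

def editing_window(reference, target, editor="BE3", window=None):
    """Return the 0-based, half-open (start, end) interval of the base
    editing window in the reference sequence.

    Assumes the target excludes the PAM and is on the forward strand.
    """
    pos = reference.find(target)
    if pos < 0:
        raise ValueError("target sequence not found in reference")
    near, far = window or DEFAULT_WINDOWS[editor]
    pam_start = pos + len(target)   # index of the first PAM base
    return pam_start - far, pam_start - near + 1

def analysis_region(reference, target, flank=0):
    """Return the (start, end) interval covering the target plus `flank`
    nucleotides on each side, clipped to the reference boundaries."""
    pos = reference.find(target)
    if pos < 0:
        raise ValueError("target sequence not found in reference")
    start = max(pos - flank, 0)
    end = min(pos + len(target) + flank, len(reference))
    return start, end
```

With a 20-nt target, the BE3 default of 13 to 17 nt upstream of the PAM corresponds to positions 4 to 8 counted from the 5' end of the target.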
Analysis of NGS data
From uploaded NGS data, BE-Analyzer first defines 15-nt indicator sequences on both sides of the given reference sequence; only queries that contain both indicator sequences, each with at most 1 nt mismatch, are collected. Then, BE-Analyzer counts the frequency of each sequence and sorts the queries in descending order. In this procedure, sequences with frequencies below the minimum are discarded. Each sequence is aligned to the reference sequence with EMBOSS needle (https://www.ebi.ac.uk/Tools/psa/emboss_needle/) (Additional file 1: Figure S1). The aligned sequences are then classified into four groups based on the presence of hyphens (−). If hyphens are found in the reference sequence or the query, the query is classified as an insertion or a deletion by comparing the number of hyphens in the two sequences. If no hyphens (inserted or deleted sequences) are found in the given target window, including the additional flanking regions, a query that matches the reference there is referred to as a WT sequence, whereas queries that contain a few mismatched nucleotides in the target window are classified as substitutions (Additional file 1: Figure S2).
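The filtering, counting, and classification steps can be sketched in Python as follows. This is an illustrative approximation rather than the BE-Analyzer code (which is written in C++): the function names are hypothetical, the alignment itself is assumed to have been produced beforehand by a global aligner such as EMBOSS needle, and the handling of equal gap counts is an assumption.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_with_mismatch(read, indicator, max_mismatch=1, start=0):
    """Return the first index >= start at which the indicator occurs in the
    read with at most max_mismatch mismatches, or -1 if there is none."""
    k = len(indicator)
    for i in range(start, len(read) - k + 1):
        if hamming(read[i:i + k], indicator) <= max_mismatch:
            return i
    return -1

def collect_queries(reads, reference, indicator_len=15, min_count=1):
    """Keep reads containing both indicators, trim them to the indicator
    span, and return (sequence, count) pairs sorted by descending count."""
    left, right = reference[:indicator_len], reference[-indicator_len:]
    counts = Counter()
    for read in reads:
        i = find_with_mismatch(read, left)
        if i < 0:
            continue
        j = find_with_mismatch(read, right, start=i + indicator_len)
        if j < 0:
            continue
        counts[read[i:j + indicator_len]] += 1
    return [(seq, n) for seq, n in counts.most_common() if n >= min_count]

def classify(aligned_ref, aligned_query, window):
    """Classify one aligned pair within window (start, end) given in
    alignment columns, 0-based and half-open."""
    start, end = window
    ref_w, qry_w = aligned_ref[start:end], aligned_query[start:end]
    ref_gaps, qry_gaps = ref_w.count("-"), qry_w.count("-")
    if ref_gaps or qry_gaps:
        # Gaps in the reference mean extra bases in the query (insertion).
        # Ties are reported as deletions here, which is an assumption.
        return "insertion" if ref_gaps > qry_gaps else "deletion"
    return "substitution" if ref_w != qry_w else "WT"
```

A read whose aligned form differs from the reference only outside the analysis window is reported as WT, which mirrors the window-restricted classification described above.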
For base editing, it is crucial to know how the mutation of one or a few nucleotides changes the amino acid sequence. To address this issue, BE-Analyzer provides the expected amino acid sequences for three different reading frames, so that users can select among three possible start positions (Fig. 3b). For each nucleotide, BE-Analyzer displays the nucleotide mutation rate in detail, highlighted with a color gradient.
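A minimal Python sketch of translation in three reading frames is given below. It uses the standard genetic code and hypothetical function names; how BE-Analyzer implements this internally is not described in the text.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(seq, frame=0):
    """Translate seq starting at offset `frame` (0, 1, or 2).

    Stop codons are shown as '*'; codons with unknown bases as 'X'.
    """
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))

def three_frames(seq):
    """Return the translations for the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]
```

Comparing `translate` output for the reference and for an edited read shows the consequence of a conversion directly: a C to T change that turns CAG into TAG, for instance, replaces a glutamine with a stop codon.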
Although cytidine deaminases mainly introduce C to T transitions in the base editing window, C to A or C to G transversions may also occur in flanking regions with low probability. Thus, BE-Analyzer shows the substitution rate at each site in the flanking windows and the C to D substitution pattern (where D denotes A, G, or T) in the target windows (Fig. 3c). In the C to D substitution graph, each substitution pattern is presented with its percentage, and the type of substitution is indicated by color (red-black-green). Optionally, if users have uploaded data from a CRISPR-untreated control, BE-Analyzer displays the substitution rate at each of those sites in the negative direction. Furthermore, for users' convenience, BE-Analyzer shows substitution patterns within the flanking windows as a heat map, which enables visualization of the dominant substitution patterns as well as background patterns.
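The per-site substitution rate and the breakdown of cytosine conversions can be computed along the following lines. This Python sketch uses hypothetical function names and assumes that the queries are indel-free, have the same length as the reference, and are given as (sequence, read count) pairs.

```python
from collections import Counter

def substitution_rates(reference, queries):
    """Percentage of reads carrying a non-reference base at each position."""
    total = sum(count for _, count in queries)
    rates = []
    for pos, ref_base in enumerate(reference):
        changed = sum(count for seq, count in queries if seq[pos] != ref_base)
        rates.append(100.0 * changed / total)
    return rates

def c_conversion_pattern(reference, queries, window):
    """Percentage of C>A, C>G and C>T outcomes among all reference
    cytosines observed inside window (start, end), 0-based and half-open."""
    start, end = window
    counts = Counter()
    total_c = 0
    for seq, count in queries:
        for pos in range(start, end):
            if reference[pos] != "C":
                continue
            total_c += count
            if seq[pos] in "AGT":
                counts["C>" + seq[pos]] += count
    if total_c == 0:
        return {}
    return {key: 100.0 * counts[key] / total_c
            for key in ("C>A", "C>G", "C>T")}
```

Running the same functions on reads from an untreated control gives the background rates that can be plotted in the negative direction for comparison.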
At the bottom of the results page, a list of categorized sequence reads aligned to the reference sequence is presented (Fig. 3d). Users can confirm all filtered sequences from the input data in this table and can also save the results by clicking the ‘Download Data’ button.
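Comparison between BE-Designer and Benchling's design tool (table flattened; most per-tool cell values are missing)
Features compared: BE (C to T); ABE (A to G); base editing window; provided organism types; provided CRISPR variants; predicted amino acid information; guide RNA length
Surviving cell values: Flexible (13 ~ 20) under base editing window; Flexible (15 ~ 25) under guide RNA length; List + Score (row not recoverable)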
BE-Analyzer is another web tool, designed for the immediate assessment of deep sequencing data obtained after treatment with base editors. BE-Analyzer analyzes deep sequencing data in the client-side web browser and displays the results in interactive tables and graphs for users' convenience. Useful information, including the ratio of intended conversions, substitution patterns, and sequence alignments, is provided so that users can easily infer how frequently and where intended or unwanted substitutions are generated.
We thank Dr. M. Schlesner at DKFZ for helpful discussion.
This work was supported by National Research Foundation of Korea (NRF) Grants (no. 2017M3A9G8084539 and 2018M3A9H3022412), Next Generation BioGreen 21 Program grant no. PJ01319301, Technology Innovation Program funded by the Ministry of Trade, Industry and Energy (no. 20000158), and Korea Healthcare technology R&D Project grant no. HI16C1012 to S.B.
Availability of data and materials
Example NGS data are freely accessible from the web site (http://www.rgenome.net/be-analyzer/example).
GHH, JP, SB conceived this project. GHH, JP, EY constructed the web tools reported in this study. KL, SK, JY, STK gave critical comments on the web panel. RE, JSK, SB supervised the research. GHH, JP, SB wrote the manuscript with the help of others. All authors read and approved the final version of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
4. Doudna JA, Charpentier E. The new frontier of genome engineering with CRISPR-Cas9. Science. 2014;346:1258096.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.