A straightforward approach for bioorthogonal labeling of proteins and organelles in live mammalian cells, using a short peptide tag
In the high-resolution microscopy era, genetic code expansion (GCE)-based bioorthogonal labeling offers an elegant way for direct labeling of proteins in live cells with fluorescent dyes. This labeling approach is currently not broadly used in live-cell applications, partly because it needs to be adjusted to the specific protein under study.
We present a generic, 14-residue long, N-terminal tag for GCE-based labeling of proteins in live mammalian cells. Using this tag, we generated a library of GCE-based organelle markers, demonstrating the applicability of the tag for labeling a plethora of proteins and organelles. Finally, we show that the HA epitope, used as a backbone in our tag, may be substituted with other epitopes and, in some cases, can be completely removed, reducing the tag length to 5 residues.
The GCE-tag presented here offers a powerful, easy-to-implement tool for live-cell labeling of cellular proteins with small and bright probes.
Tracking the dynamics of proteins and organelles in live cells is key to understanding their functions. For this, fluorescent protein (e.g., GFP) or self-labeling protein (e.g., Halo-Tag) tags are routinely attached to proteins in cells. While these tags are robust and easy to implement, they are large and bulky (e.g., GFP, 27 kDa; Halo-tag, 33 kDa), such that their attachment could affect the dynamics and function of the protein under study. Using genetic code expansion (GCE) and bioorthogonal chemistry, it is now possible to attach fluorescent dyes (Fl-dyes) to specific protein residues, thereby allowing direct labeling of proteins in live cells with Fl-dyes [1, 2, 3]. Indeed, this approach has been applied, in recent years, for fluorescent labeling of extra- and intracellular proteins [4, 5, 6, 7, 8, 9, 10].
In GCE-based labeling, a non-canonical amino acid (ncAA) carrying a functional group is incorporated into the sequence of a protein in response to an in-frame amber stop codon (TAG), via an orthogonal tRNA/tRNA-synthetase pair (reviewed in [11, 12]). Labeling is then carried out by a rapid and specific bioorthogonal reaction between the functional group and the Fl-dye [2, 4, 8, 9, 13, 14]. Successful labeling hence relies on the exogenous expression of an orthogonal tRNA/tRNA-synthetase pair and a protein of interest (bearing a ncAA) at sufficient levels to allow effective labeling.
The ncAA (and consequently the Fl-dye) can, in theory, be incorporated anywhere in the protein sequence. In practice, however, finding a suitable labeling site can be laborious and time-consuming for several reasons. First, prior knowledge or functional assays are necessary to ensure that the insertion of the ncAA at a specific position does not affect protein structure and function [4, 5, 6, 7, 10]. Second, the efficiency of ncAA incorporation varies at different locations in the protein, and no guidelines for the preferred sequence context have been reported [3, 4, 5, 6, 7, 15]. Notably, low efficiency of ncAA incorporation leads not only to ineffective labeling but also to the translation of a truncated version of the protein (resulting from the insertion of a premature stop codon), which can be toxic to cells [5, 6, 16, 17]. Third, the ncAA should be incorporated in a position that leaves the functional group accessible to the solvent, to enable efficient bioorthogonal conjugation with the Fl-dye. All these requirements are protein specific, such that any attempt at labeling via this approach begins with a screen for suitable incorporation sites [2, 3, 4, 6, 7]. Consequently, despite its great potential, GCE-based labeling is presently not widely used in mammalian live-cell imaging studies.
Proteins can potentially be tagged at their N- or C-terminus. For GCE-based labeling, we chose to design an N-terminal tag in order to avoid protein truncations resulting from inefficient incorporation of the ncAA [5, 6, 16, 17]. On the basis of our previous work, potential tags were cloned into a single expression vector, designed to encode the incorporation of the ncAA bicyclo-nonyne Lysine (BCN-Lys) that bioorthogonally reacts with tetrazine-conjugated Fl-dyes (Fig. 1b and Additional file 1: Figure S1a) [6, 18, 19]. Labeling potency was initially assessed using α-tubulin as a benchmark and evaluating microtubule (MT) labeling in live mammalian cells in the presence of Silicon-Rhodamine-Tetrazine (SiR-Tet) [6, 20]. MT labeling obtained upon site-specific incorporation of BCN-Lys in α-tubulin at position 45 (α-tubulin45TAG) was used as a reference, given our earlier demonstration of the efficacy of MT labeling at this site.
To evaluate labeling efficiencies obtained using potential GCE-tags, signal-to-noise ratios (SNRs) were measured in labeled cells based on line intensity profiles, as illustrated in Fig. 2d. SNR values measured in cells expressing any of the α-tubulin versions were significantly higher than the average background levels measured in cells expressing a wild type (WT) version of α-tubulin that is unable to incorporate the ncAA, verifying that labeling is specific (Fig. 2e). Cells expressing α-tubulin tagged with probe 4 exhibited significantly higher SNR compared to probe 3, indicating that HA-GGSG-ncAA is preferable to HA-GS-ncAA for MT labeling. Notably, despite the relatively low expression levels obtained using probe 4, the average SNR values obtained in cells expressing α-tubulin tagged with probe 4 were higher than those obtained in cells expressing α-tubulin45TAG or in cells expressing α-tubulin-GFP (Fig. 2d, e). Improved SNRs under lower α-tubulin expression levels can result from reducing the fraction of cytosolic (non-polymerized) labeled tubulin in cells, and lower expression is preferable for cell physiology. We therefore concluded that HA-GGSG-ncAA-tagged α-tubulin is superior to site-specific ncAA-incorporated α-tubulin for MT labeling. Based on these results we defined the N′ HA-GGSG-ncAA as the minimal tag for GCE-based bioorthogonal labeling of proteins in live mammalian cells (Fig. 2f).
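The composition of the tag defined above can be checked with a short sketch. It assumes the canonical HA epitope sequence (YPYDVPDYA), uses "X" as a placeholder for the ncAA encoded by the amber codon, ignores any initiator methionine, and estimates mass with the common rule of thumb of about 110 Da per residue. The function names are illustrative and do not come from this work.

```python
# Sketch only: one-letter residue codes; "X" marks the ncAA placeholder.
HA_EPITOPE = "YPYDVPDYA"  # canonical HA epitope (assumed to be the variant used)
LINKER = "GGSG"           # linker named in the text
NCAA = "X"                # placeholder for BCN-Lys at the amber (TAG) codon


def build_tag(epitope: str = HA_EPITOPE, linker: str = LINKER) -> str:
    """Return the tag as epitope + linker + ncAA placeholder."""
    return epitope + linker + NCAA


def approx_mass_kda(n_residues: int, da_per_residue: float = 110.0) -> float:
    """Rough mass estimate; 110 Da per residue is a rule of thumb, not a measurement."""
    return n_residues * da_per_residue / 1000.0


full_tag = build_tag()
minimal_tag = build_tag(epitope="")  # epitope removed, linker and ncAA kept

print(full_tag, len(full_tag))        # YPYDVPDYAGGSGX 14
print(minimal_tag, len(minimal_tag))  # GGSGX 5
print(round(approx_mass_kda(len(full_tag)), 2))  # 1.54
```

Under these assumptions the full tag comes to 14 residues and roughly 1.5 kDa, and the epitope-free variant to 5 residues, in line with the figures quoted in the text.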
GCE-tagged GFP-SKL was expressed in HEK293 cells, and a specific band at a similar size was observed in an in-gel fluorescence assay upon labeling the cells with SiR-Tet, in the presence of BCN-Lys (Fig. 4a, b). Notably, while WT GFP-SKL was expressed at higher levels compared to GCE-tagged GFP-SKL, no specific labeling was observed for WT GFP-SKL using in-gel fluorescence, strongly indicating that labeling is specifically induced by binding of the Fl-dye to the ncAA in GCE-tagged GFP-SKL. In live cells, remarkable co-localization of GFP and SiR in small puncta was observed throughout the cell (except in the nucleus) at similar SNRs (Fig. 4c, Pearson’s correlation = 0.86). This was not the case in cells expressing WT GFP-SKL and labeled with SiR, which exhibited essentially no correlation between GFP and SiR puncta (Fig. 4d, e, Pearson’s correlation = 0.07). In these cells, a small population of SiR puncta was observed but these puncta did not co-localize with the GFP puncta observed throughout the cell. These results further indicate that the co-localization observed in cells expressing GCE-tagged GFP-SKL resulted from specific SiR labeling of the protein.
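For readers unfamiliar with the metric, the sketch below shows how a Pearson's correlation coefficient between two channels can be computed from paired, background-subtracted pixel intensities. It is a generic illustration, not the SlideBook co-localization plugin used in this work, and the intensity values are hypothetical. Values near 1 indicate co-variation of the two signals, values near 0 indicate no linear relationship, and negative values indicate anti-correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return covariance / sqrt(var_x * var_y)


# Hypothetical paired pixel intensities from one region of interest.
gfp_pixels = [10, 50, 200, 30, 180, 5]
sir_pixels = [12, 45, 190, 35, 170, 8]
print(pearson(gfp_pixels, sir_pixels))
```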
The organization of the fluorescence signal into discrete puncta in SKL labeled cells allowed us to estimate relative labeling yields by segmenting the population of puncta in one channel and quantifying their intensities in the other channel (Fig. 4f, g). GFP puncta segmented from cells expressing WT GFP-SKL and labeled with SiR-Tet showed low intensity SiR levels that did not scale with GFP intensity levels (Fig. 4f). We therefore reasoned that the SiR levels obtained in these cells represent background SiR fluorescence. In cells expressing GCE-tagged GFP-SKL, the majority of segmented GFP puncta (~ 95%) had higher SiR intensity levels than those measured in cells expressing WT GFP-SKL. Moreover, intensity levels in the SiR channel scaled with GFP intensities, indicating that SiR fluorescence can be used for quantifying the relative levels of GFP-SKL in single peroxisomes. Segmenting the SiR-positive puncta in cells expressing GCE-tagged GFP-SKL globally resulted in similar behavior, with a small population (less than 5%) of SiR puncta that appeared GFP negative (Fig. 4g). In cells expressing WT GFP-SKL, a small population of SiR puncta was segmented. These puncta had low intensity levels in both GFP and SiR channels compared to SiR puncta segmented in GCE-tagged GFP-SKL expressing cells, indicating that they represent noise. These structures can be easily filtered out based on intensity, using image processing. Taken together, these data indicate that GCE-tag SiR-based labeling has ~ 95% yield compared to GFP and that it can be used for quantitative live-cell imaging of peroxisomes.
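The yield estimate described above can be sketched as the fraction of segmented GFP puncta whose SiR intensity exceeds the background measured in WT GFP-SKL cells. The cutoff used here (the highest SiR intensity among WT puncta) is an assumption made for illustration, since the exact criterion is not stated in the text, and the intensity values are hypothetical.

```python
def labeling_yield(sir_tagged, sir_wt_background):
    """Fraction of GFP puncta whose SiR intensity exceeds the WT background."""
    if not sir_tagged or not sir_wt_background:
        raise ValueError("both intensity lists must be non-empty")
    # Assumed cutoff: brightest SiR value seen in WT (background-only) puncta.
    threshold = max(sir_wt_background)
    above = sum(1 for value in sir_tagged if value > threshold)
    return above / len(sir_tagged)


# Hypothetical per-punctum SiR intensities (arbitrary units).
wt_puncta = [3, 5, 4, 6]
tagged_puncta = [40, 55, 62, 48, 70, 35, 90, 44, 51, 66,
                 58, 73, 39, 47, 81, 60, 52, 4, 68, 77]
print(labeling_yield(tagged_puncta, wt_puncta))  # 0.95
```

In this made-up data set, 19 of 20 puncta exceed the cutoff. A more conservative cutoff (for example, a high percentile of the WT distribution) would be an equally reasonable choice.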
Successful and specific labeling was also obtained for GCE-tagged extracellular protein EGFR and intracellular ESCRT-III protein CHMP4B as indicated by in-gel fluorescence and live-cell imaging (Additional file 1: Figure S5). For EGFR labeling, the GCE-tag was inserted between the signal peptide and the protein coding sequence (Additional file 1: Figure S5c, d), indicating that use of the GCE-tag is not restricted to the N-terminus. Thus, besides its use in organelle labeling, the GCE-tag can be used for labeling intra- and extracellular proteins in live cells.
Additional file 2: Movie S1. Microtubule dynamics in cells labeled with GCE-tag-α-tubulin. COS7 cells expressing GCE-tag-α-tubulin and labeled with SiR-Tet were recorded at 4 s intervals. Shown are maximum intensity projections of 3 z-slices taken from a representative cell. Scale-bar: 10 μm.
Additional file 4: Movie S3. Peroxisome dynamics in cells labeled with GCE-tag-GFP-SKL. COS7 cells expressing GCE-tag-GFP-SKL and labeled with SiR-Tet were recorded at 5.3 s intervals. Left panel: 488 (GFP, green) channel, middle panel: 640 (SiR, red) channel. Shown are maximum intensity projections of 30 z-slices taken from a representative cell. Scale-bar: 10 μm.
Additional file 8: Movie S7. Visualizing ER dynamics by applying FRAP to GCE-tag-ERcb5TM labeled ER. COS7 cells expressing GCE-tag-ERcb5TM labeled with TAMRA-Tet were imaged for 2 min with 2 s intervals. Photobleaching was performed after 4 baseline timepoints (12 s). Shown are maximum intensity projections of 3 z-slices taken from a representative cell. Scale-bar: 10 μm.
Last, we tested how removing the epitope sequence from the GCE-tag affects labeling (Additional file 1: Figure S1i). In other words, is a sequence encoding a GGSG linker followed by a TAG codon sufficient for GCE-based bioorthogonal labeling of proteins? Almost no specific labeling was obtained upon tagging α-tubulin with a GCE-tag that lacks the HA sequence, either by in-gel fluorescence or by live-cell imaging (Fig. 8d bottom panel and Additional file 1: Figure S6f). For peroxisomes and exosomes, specific labeling was obtained, albeit at reduced levels (Fig. 8b, c bottom panels). In peroxisomes, the reduced labeling was consistent with the reduced expression of the protein observed by western blot (Fig. 8a, b bottom panel; Pearson’s correlations, 0.76). Reduced specific labeling was also observed for Exo70 using in-gel fluorescence (Additional file 1: Figure S6e), raising the possibility that the reduced labeling observed in cells is a result of failure to incorporate the ncAA during translation. Therefore, in specific cases in which the length of the tag is critical for preserving the function of the protein, the GCE-tag may be reduced to as few as five residues. Yet, this comes at the price of labeling efficiency and thus should be tested on a case-by-case basis.
In this work, we present a generic, small tag for labeling proteins in live mammalian cells with Fl-dyes through GCE and bioorthogonal chemistry, using a single expression vector. Efficient and specific labeling with Fl-dyes was observed for various intracellular structures and compartments, including MTs, PM, exosomes, lysosomes, MVBs, peroxisomes, and ER using appropriate tag-bearing markers, and for extracellular and cytosolic proteins that carry the tag. By adding 14 residues to the N-terminus of proteins, we minimized the need for prior knowledge of the protein and bypassed the screening step currently associated with the technique. Moreover, by inserting the TAG codon at the beginning of the coding sequence (rather than in the middle of the sequence), we avoided the expression of truncated proteins. As labeling efficiencies obtained using the GCE-tag were either comparable or superior to site-specific labeling, the GCE-tag presented here provides an attractive, easy-to-implement alternative for bioorthogonal labeling of proteins modified to carry a ncAA.
The GCE-tag reported here is considerably smaller than Fl-protein tags and self-labeling protein tags (GCE-tag, ~ 1.5 kDa; GFP, 27 kDa; SNAP-tag, 19 kDa) [1, 2]. The only tag with a comparable size to that of the GCE-tag is the 12 amino-acid-long FlAsH tag. However, labeling proteins with FlAsH tags relies on biarsenical-functionalized fluorescent dyes, which exhibit unspecific binding to cellular membranes and are toxic to cells. Therefore, although adding a tag is not as elegant as site-specific labeling, the GCE-tag stands as one of the shortest tags available for fluorescence labeling of proteins and organelles in live cells.
GCE-based bioorthogonal labeling is known to suffer from relatively high noise levels compared to fluorescent protein tag-based approaches. The main noise sources are non-specific binding of the tetrazine-conjugated fluorescent dyes and excess tRNA that is charged with a ncAA and is free to undergo the bioorthogonal reaction. The latter is the main source of the unspecific staining observed in the nucleolus. To minimize these noise factors, we previously performed careful optimization of all assay components [6, 19]. Using our optimized labeling assay conditions, GCE-tag-based labeling was comparable to protein tags in terms of labeling efficiencies and SNR values. Given the small size of the tag, which is likely to better preserve the physiological properties of proteins and organelles, we find the GCE-tag superior to conventional protein tags for protein labeling.
List of organelle markers labeled via bioorthogonal labeling using the GCE-tag
Successful GCE-based bioorthogonal labeling mainly relies on two steps: expression of a protein carrying the ncAA, and a bioorthogonal reaction between the ncAA and the Fl-dye. We find that protein expression levels do not always correlate with efficient labeling. In some cases, lower expression of the modified protein resulted in higher SNRs. When labeling cellular structures, such behavior can result from expressing the modified proteins at levels higher than the capacity of the cellular structure. In such cases, the excess of overexpressed protein will remain in the cytosol, leading to increased background levels and consequently lower SNRs. Maintaining low expression levels of the modified protein is also preferable for cell physiology. Therefore, upon applying GCE-based labeling, we recommend testing both parameters and choosing the lowest expression conditions that provide the highest SNRs.
The use of the GCE-tag reported here expands the plethora of labeling options for proteins in live cells. As with other labeling approaches, GCE-tag-based bioorthogonal labeling can be combined with conventional labeling approaches for multicolor labeling and can be tailored for super-resolution microscopy. Moreover, the approach can potentially be expanded to labeling endogenous proteins using genome-editing techniques and can be further applied to recently reported GCE-modified tissues and organisms [27, 28, 29, 30, 31]. Additionally, a variety of ncAAs that carry different chemical functionalities have been genetically encoded in mammalian cells [8, 32, 33, 34, 35]. The GCE-tag can be further used for incorporating these ncAAs, expanding its use in labeling applications and beyond.
Materials and methods
COS7 and HEK293T cells were kind gifts from Marcelo Ehrlich (Tel Aviv University). Cells were grown in Dulbecco’s modified Eagle’s medium (DMEM; Life Technologies, Carlsbad, CA) supplemented with 10% fetal bovine serum (FBS), 2 mM glutamine, 10,000 U/ml penicillin, and 10 mg/ml streptomycin.
Plasmids and constructs
Tags were sub-cloned into the single expression vector pBUD-BCNK-RS, which carries pylT encoding tRNACUAPyl and Pyrrolysyl-tRNA synthetase (Additional file 1: Figure S1), using NotI/KpnI restriction sites. Sequences encoding organelle markers or proteins of interest were then inserted in-frame using the KpnI/XhoI restriction sites of pBUD-BCNK-RS. All constructs were sequenced before use.
Incorporation of the ncAA to proteins in cells
The assay was performed according to our previously optimized protocol. Twenty-four hours before transfection, cells were plated at 20% confluency using the following dishes: live cell imaging, 4-well chamber slide (Ibidi, Martinsried, Germany); western blot, 12-well plate (NUNC, Rochester, NY); and immunostaining, #1.0 coverslips (Menzel, Braunschweig, Germany). Cells were transfected with pBUD-BCNK-RS plasmids carrying different tags and organelle markers (Additional file 1: Figure S1, Table S1) using Lipofectamine 2000 (Life Technologies, Carlsbad, CA) according to the manufacturer’s protocol, and incubated for 48 h in the presence of the ncAA BCN-Lys (0.5 mM, Synaffix, Oss, Netherlands) in growth media supplemented with 100 μM ascorbic acid (Sigma Aldrich, Israel).
Forty-eight hours post-transfection, cells were washed with fresh medium (3 × quick wash followed by 3 × 30 min wash) at 37 °C, incubated with SiR-Tet (1–2 μM, 1 h, Spirochrome, Stein am Rhein, Switzerland) or TAMRA-Tet (2 μM, 1 h, Jena BioScience, Germany), and washed again with fresh medium (3 × quick wash and 3 × 30 min wash) at 37 °C.
Forty-eight hours post-transfection, cells were labeled with the appropriate Fl-dye as described above and in Table 1. After labeling, cells were collected, centrifuged at 200×g for 5 min, and washed with PBS twice. Then, cells were lysed using RIPA lysis buffer (150 mM NaCl, 1% NP-40, 0.5% deoxycholate, 0.1% SDS, 50 mM Tris [pH 8.0]) supplemented with complete protease inhibitor for 30 min at 4 °C. Total protein concentrations were measured with BCA Protein Assay Kit (Pierce Biotechnology), and equal total protein amounts were loaded on an SDS-PAGE gel. Gels were imaged at the appropriate wavelength using a Typhoon FLA 7000 biomolecular imager (GE Healthcare, PA, USA) to reveal specific Fl-dye labeling.
Cells were harvested 48 h post-transfection and lysed using RIPA lysis buffer supplemented with complete protease inhibitor for 30 min at 4 °C. Total protein concentrations were measured with BCA Protein Assay Kit, and equal total protein amounts were loaded and were subjected to western blot analysis using the following primary antibodies: rabbit anti-HA (1:4000, catalog number G166, Applied Biological Materials, Richmond, Canada, RRID, AB_2813867), mouse anti-GAPDH (1:1000, catalog number G041, Applied Biological Materials, RRID, AB_2813868), mouse anti-GFP (1:1000, catalog number G096, Applied Biological Materials, RRID, AB_2813869), and rabbit or mouse-peroxidase secondary antibodies (1:10,000, catalog number 715-035-151 or 711-035-152, Jackson ImmunoResearch, West Grove, PA, RRID, AB_2340771 or AB_10015282).
Cells were fixed 48 h post-transfection with 4% paraformaldehyde (PFA) and co-stained with rabbit anti-HA (1:500) and mouse anti-CD63 (1:200, catalog number ab59479, Abcam, Cambridge, MA, RRID, AB_940915) primary antibodies and with Alexa Fluor 488 anti-rabbit and Alexa Fluor 594 anti-mouse (1:500, catalog number A21206 or A21203, Life Technologies; RRID: AB_2535792 or AB_2535789) secondary antibodies. Cells were mounted with Fluoromount-G (SouthernBiotech, Birmingham, AL).
Live cell imaging
Cells were imaged on a fully incubated confocal spinning-disk microscope at 37 °C (Marianas; Intelligent Imaging, Denver, CO) using a × 63 oil objective (numerical aperture 1.4); depending on the protein expression levels of each cell, the specified Fl-dye or FP was excited at laser powers between 5 and 40% for 100–350 ms, and recorded on an electron-multiplying charge-coupled device camera (pixel size, 0.079 μm; Evolve; Photometrics, Tucson, AZ). A total of 3–30 confocal slices were captured for each image.
Analysis of image sets was performed using SlideBook, version 6 (Intelligent Imaging, Denver, CO). To improve visibility, some of the images were subjected to an unsharp mask, a Gaussian filter, or both. Care was taken to apply similar filtering for all channels acquired in the same image and for all images that represent the same cellular structure, to allow for comparison. For line intensity analysis, the intensity levels along a line that crosses the cellular structure were measured. Intensity levels were normalized to 1 after background subtraction and plotted. SNRs for each cellular structure were calculated by dividing the maximum intensity by the minimum intensity obtained in each line intensity plot. Intensity levels shown in Fig. 6g were calculated by dividing the mean intensity value obtained inside the cell by the mean intensity value measured outside the cell. SNR and intensity measurements for each cellular structure were taken from at least ten different cells from three independent experiments. Plots presented in Fig. 4f, g were generated by segmenting puncta structures using an intensity mask in SlideBook, based on the GFP channel (f) or the SiR channel (g), and plotting the intensity in one channel against the intensity in the other channel. All values were subjected to background subtraction. Pearson’s correlation analysis was performed for the indicated regions of interest, using the co-localization plugin in SlideBook. Low intensity levels were filtered out to avoid noise measurements (indicated by green and red lines in plots). Analysis was performed after background subtraction.
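The line-profile SNR described above can be illustrated with a minimal sketch: subtract background, normalize the profile so that its peak equals 1, and divide the maximum by the minimum. The function name, the single scalar background value, and the example intensities are assumptions for illustration; the actual measurements were made in SlideBook.

```python
def line_profile_snr(profile, background):
    """Return (normalized_profile, snr) for one line intensity profile.

    profile: raw intensities sampled along a line crossing the structure.
    background: scalar background level to subtract (assumed known).
    """
    corrected = [value - background for value in profile]
    peak = max(corrected)
    floor = min(corrected)
    if floor <= 0:
        # A max/min ratio is only meaningful if every point stays above background.
        raise ValueError("profile minimum must exceed the background level")
    normalized = [value / peak for value in corrected]
    return normalized, peak / floor


# Hypothetical profile across a single microtubule (arbitrary units).
normalized, snr = line_profile_snr([120, 130, 520, 125], background=100)
print(snr)  # 21.0
```

Because both the maximum and the minimum are scaled by the same factor, the ratio is the same whether it is taken before or after normalization.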
FRAP experiments were performed using the same confocal spinning-disk microscope setup, on cells expressing GCE-tag-ERcb5TM and labeled with TAMRA-Tet. After four baseline time points, bleaching was carried out on a selected ROI using 100 ms exposure to a 405 nm laser. Recovery after bleaching was recorded for 1 min with 1.8 s intervals. FRAP analysis fitting and unintentional bleaching corrections were performed using SlideBook.
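As an illustration of what a correction for unintentional bleaching can look like, the sketch below applies a commonly used double normalization: the bleached ROI is divided by an unbleached reference region at each time point and then scaled to the mean of the pre-bleach baseline. This is an assumption about a generic procedure, not a description of SlideBook's implementation; the function names, the plateau window, and the intensity values are hypothetical. Only the four baseline time points are taken from the text.

```python
def frap_normalize(roi, reference, n_baseline=4):
    """Double-normalize a FRAP trace (inputs assumed background-subtracted)."""
    if len(roi) != len(reference) or len(roi) <= n_baseline:
        raise ValueError("traces must have equal length and extend past the baseline")
    # Dividing by the reference corrects for bleaching caused by acquisition itself.
    ratio = [r / ref for r, ref in zip(roi, reference)]
    pre_bleach = sum(ratio[:n_baseline]) / n_baseline
    return [value / pre_bleach for value in ratio]


def mobile_fraction(normalized, n_baseline=4, n_plateau=3):
    """Estimate the mobile fraction from the first post-bleach point and the plateau."""
    first_post_bleach = normalized[n_baseline]
    plateau = sum(normalized[-n_plateau:]) / n_plateau
    return (plateau - first_post_bleach) / (1.0 - first_post_bleach)


# Hypothetical traces: four baseline frames, then bleaching and recovery.
roi_trace = [100, 100, 100, 100, 20, 45, 58, 60, 60, 60]
reference_trace = [100] * 10
curve = frap_normalize(roi_trace, reference_trace)
print(mobile_fraction(curve))
```

Fitting an exponential recovery model to the normalized curve would additionally give a recovery half-time; that step is omitted here to keep the sketch dependency-free.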
Statistical analysis was performed in GraphPad Prism version 5.00 for Windows (La Jolla, CA, USA). Data are shown as means ± SEM. Statistical significance was determined by ANOVA or t test analysis: ***p < 0.0001, **p = 0.001–0.01, *p = 0.01–0.05, and ns p > 0.05.
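For reference, the mean ± SEM summary can be reproduced outside Prism with a few lines. The sketch assumes the standard definition of the SEM as the sample standard deviation divided by the square root of the number of measurements; the SNR values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of measurements."""
    if len(values) < 2:
        raise ValueError("at least two measurements are needed for an SEM")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical SNR values measured in ten cells.
snr_values = [4.1, 3.8, 5.0, 4.6, 4.2, 3.9, 4.8, 4.4, 4.0, 4.7]
average, sem = mean_sem(snr_values)
print(f"{average:.2f} ± {sem:.2f}")
```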
We thank Peter Kim (University of Toronto) and Jennifer Lippincott-Schwartz (Janelia Research Campus) for contributing plasmids for fluorescent protein-based organelle markers. We also thank all members of the Elia lab for critical reading of the manuscript.
IS and NE designed and analyzed all the experiments. IS performed most experiments. DN performed the biochemical experiments and provided technical support. AK and AA performed the experiments. EA advised on the GCE. NE, EA, and DN initiated the project. NE wrote the manuscript. All authors read and approved the final manuscript.
The project leading to this application has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research.
Ethics approval and consent to participate
No ethics approval was required for this study.
The authors declare that they have no competing interests.
- 9. Lang K, Davis L, Wallace S, Mahesh M, Cox DJ, Blackman ML, Fox JM, Chin JW. Genetic encoding of bicyclononynes and trans-cyclooctenes for site-specific protein labeling in vitro and in live mammalian cells via rapid fluorogenic Diels-Alder reactions. J Am Chem Soc. 2012;134(25):10317–20.
- 24. Snapp EL, Altan N, Lippincott-Schwartz J. Measuring protein mobility by photobleaching GFP chimeras in living cells. Curr Prot Cell Biol. 2003;Chapter 21:Unit 21 21.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.