High-Resolution 3D Light Microscopy with STED and RESOLFT
This chapter discusses the simple yet powerful ideas that have made it possible to break the diffraction resolution limit of lens-based optical microscopy. The basic principles and standard implementations of STED (stimulated emission depletion) and RESOLFT (reversible saturable/switchable optical linear (fluorescence) transitions) microscopy are introduced, followed by selected highlights of recent advances, including MINFLUX (minimal photon fluxes) nanoscopy with molecule-size (~1 nm) resolution.
Keywords: Fluorescence imaging · Superresolution · Nanoscopy · Molecular states · Diffraction-unlimited resolution · Far-field optics · Neuroscience · Cell biology · Stimulated emission · Switchable proteins
We are all familiar with the sayings “a picture is worth a thousand words” and “seeing is believing”. Not only do they apply to our daily lives, but certainly also to the natural sciences. Therefore, it is probably not by chance that the historical beginning of modern natural sciences very much coincides with the invention of light microscopy. With the light microscope mankind was able to see for the first time that every living being consists of cells as basic units of structure and function; bacteria were discovered with the light microscope, and also mitochondria as examples of subcellular organelles.
However, we learned in high school that the resolution of a light microscope is limited to about half the wavelength of the light [1, 2, 3, 4], which typically amounts to about 200–350 nm. If we want to see details of smaller things, such as viruses for example, we have to resort to electron microscopy. Electron microscopy has achieved a much higher spatial resolution—tenfold, hundredfold or even thousandfold higher; in fact, down to the size of a single molecule. Therefore the question arises: why do we still care about the light microscope and its spatial resolution, now that we have the electron microscope?
The first reason is that light microscopy is the only way in which we can look inside a living cell, or even living tissues, in three dimensions; it is minimally invasive. But there is another reason. When we look into a cell, we are usually interested in a certain species of proteins or other biomolecules, and we have to make this species distinct from the rest—we have to “highlight” those proteins. This is because, to light or to electrons, all the proteins look the same.
In light microscopy this “highlighting” is readily feasible by attaching a fluorescent molecule to the biomolecule of interest. Importantly, a fluorescent molecule has, among others, two fundamental states: a ground state and an excited fluorescent state with higher energy. If we shine light of a suitable wavelength on it, for example green light, it can absorb a green photon so that the molecule is raised from its ground state to the excited state. Right afterwards the atoms of the molecule wiggle a bit—this wiggling corresponds to the molecule’s vibrational sub-states—but within a few nanoseconds, the molecule relaxes back to the ground state by emitting a fluorescence photon.
Because some of the energy of the absorbed (green) photon is lost in the wiggling of the atoms, the fluorescence photon is red-shifted in wavelength. This is actually very convenient, because we can now easily separate the fluorescence from the excitation light, the light with which the cell is illuminated. This shift in wavelength makes fluorescence microscopy extremely sensitive. In fact, it can be so sensitive that one can detect a single molecule, as has been discovered through the works of W. E. Moerner, of Michel Orrit and their co-workers.
However, if a second molecule, a third molecule, a fourth molecule, a fifth molecule and so on are positioned closer together than about 200–350 nm, we cannot tell them apart, because they appear in the microscope as a single blur. Therefore, it is important to keep in mind that resolution is about telling features apart; it is about distinguishing them. Resolution must not be confused with sensitivity of detection, because it is about seeing different features as separate entities.
1.1 Breaking the Diffraction Barrier in the Far-field Fluorescence Microscope
Now it is easy to appreciate that a lot of information is lost if we look into a cell with a fluorescence microscope: anything that is below the scale of 200 nm appears blurred. Consequently, if one manages to come up with a focusing (far-field) fluorescence microscope which has a much higher spatial resolution, this would have a tremendous impact in the life sciences and beyond.
The person who fully appreciated that diffraction poses a serious limit on the resolution was Ernst Abbe, who lived at the end of the nineteenth century and who cast this “diffraction barrier” into an equation which has been named after him. It says that, in order to be separable, two features of the same kind have to be further apart than the wavelength divided by twice the numerical aperture of the objective lens. One can find this equation in most textbooks of physics or optics, and also in textbooks of biochemistry and molecular biology, due to the enormous relevance of light microscopy in these fields. Abbe’s equation is also found on a memorial which was erected in Jena, Germany, where Ernst Abbe lived and worked, and there it is written in stone. This is what scientists believed throughout the twentieth century. However, not only did they believe it, it also was a fact. For example, if one wanted to look at features of the cellular cytoskeleton in the twentieth century, this was the type of resolution obtained.
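Abbe’s relation reads d = λ/(2 NA), with NA the numerical aperture of the objective lens. As a quick numerical illustration (a sketch in Python; the function name and example values are our own choices, not from the chapter):

```python
# Abbe's diffraction limit: d = wavelength / (2 * NA).
# Illustrative helper; the example values below are a typical choice.
def abbe_limit(wavelength_nm, numerical_aperture):
    """Smallest resolvable separation of two like features."""
    return wavelength_nm / (2.0 * numerical_aperture)

# Green light (500 nm) focused by a high-end oil-immersion objective (NA = 1.4):
print(f"Abbe limit: {abbe_limit(500.0, 1.4):.0f} nm")
```

Even with visible wavelengths and the highest practical apertures, d stays near 200 nm, which is the origin of the 200–350 nm figure quoted above.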
This equation was coined in 1873. Yet so much new physics emerged during the twentieth century, and so many new phenomena were discovered, that there should be at least one phenomenon that could be utilized to overcome the diffraction barrier in a light microscope operating with propagating beams of light and regular lenses. S.W.H. understood that it would not work just by changing the way the light propagates, the way the light is focused. [Actually he had looked into that; it led him to the invention of the 4Pi microscope [11, 12], which improved the axial resolution, but did not overcome Abbe’s barrier.] S.W.H. was convinced that a potential solution must have something to do with the major discoveries of the twentieth century: quantum mechanics, molecules, molecular states and so on.
Therefore, he started to check his textbooks again in order to find something that could be used to overcome the diffraction barrier in a light-focusing microscope. In simple terms, the idea was to check out the spectroscopic properties of fluorophores, their state transitions, and so on; maybe there is one that can be used for the purpose of making Abbe’s barrier obsolete. Alternatively, there could be a quantum-optical effect whose potential has not been realized, simply because nobody thought about overcoming the diffraction barrier .
With these ideas in mind, one day when he was not very far from [Stockholm] in Åbo/Turku, just across the Gulf of Bothnia, on a Saturday morning, S.W.H. browsed a textbook on quantum optics and stumbled across a page that dealt with stimulated emission. All of a sudden he was electrified. Why?
To reiterate, the problem is that the lens focuses the light in space, but not more tightly than 200 nm. All the features within the 200-nm region are simultaneously flooded with excitation light. This cannot be changed, at least not when using conventional optical lenses. But perhaps we can change the fact that all the features which are flooded with (excitation) light are, in the end, capable of sending light (back) to the detector. If we manage to keep some of the molecules dark—to be precise, put them in a non-signaling state in which they are not able to send light to the detector—we will see only the molecules that can, i.e. those in the bright state. Hence, by registering bright-state molecules as opposed to dark-state molecules, we can tell molecules apart. So the idea was to keep a fraction of the molecules residing in the same diffraction area in a dark state, for the period of time in which the molecules residing in this area are detected. In any case, keep in mind: the state (transition) is the key to making features distinct. And resolution is about discerning features.
The physical condition for achieving this is that the wavelength of the stimulating beam is longer than that of the excitation beam (Fig. 1.3c). The photons of the stimulating beam have a lower energy, so as not to excite molecules but to stimulate them to go from the excited state back down to the ground state. There is another condition, however: we have to ensure that there is indeed a red photon at the molecule which pushes the molecule down. We emphasize this because most red-shifted photons pass the molecules by, as there is only a finite interaction probability of a photon with a molecule, i.e. a finite cross-section of interaction. But if one applies a stimulating light intensity at or above a certain threshold, one can be sure that there is at least one photon which “kicks” the molecule down to the ground state, thus making it instantly assume the dark state.
Figure 1.3d shows the probability of the molecule assuming the bright state, the S1, in the presence of the red-shifted beam transferring the molecule to the dark ground state. Beyond a certain threshold intensity, Is, the molecule is clearly turned “off”. One can apply basically any intensity of green light; yet the molecule will not be able to occupy the bright state and thus will not signal. Now the approach is clear: we simply modify this red beam to have a ring shape in the focal plane [19, 24], such that it does not carry any intensity at the centre. Thus, we can turn off the fluorescence ability of the molecules everywhere but at the centre. The ring or “doughnut” becomes weaker and weaker towards the centre, where it is ideally of zero intensity. There, at the centre, we will not be able to turn the molecules off, because there is no STED light, or it is much too weak.
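The saturation of the off-switching can be sketched with a simple two-level rate model (our own simplifying assumption; the chapter’s Fig. 1.3d shows the actual behaviour). In this model, stimulated emission at intensity I competes with spontaneous decay, so the steady-state probability of finding the molecule in the bright state falls off as 1/(1 + I/Is):

```python
# Simple two-level steady-state model (an illustrative assumption):
# stimulated emission at intensity I competes with spontaneous decay,
# giving P(bright) ~ 1 / (1 + I/Is), with Is the saturation threshold.
def bright_state_probability(i_over_is):
    return 1.0 / (1.0 + i_over_is)

for x in (0, 1, 10, 100):
    print(f"I/Is = {x:3d}  ->  P(bright) = {bright_state_probability(x):.3f}")
```

Already at I = 10 Is the molecule is dark more than 90% of the time, which is why intensities well above Is effectively confine the molecules to the ground state everywhere except at the doughnut centre.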
Now let’s have a look at the sample (Fig. 1.3b) and let us assume that we want to see just the fibre in the middle. Therefore, we have to turn off the fibre to its left and the one to its right. What do we do? We cannot make the ring smaller, as it is also limited by diffraction. Abbe would say: “Making narrower rings of light is not possible due to diffraction.” But we do not have to do that. Rather, we simply have to “shut off” the molecules of the fibres that we do not want to see, that is, we make their molecules dwell in a dark state, until we have recorded the signal from that area. Obviously, the key lies in the preparation of the states. So what do we do? We make the beam strong enough so that the molecules even very close to the centre of the ring are turned “off” because they are effectively confined to the ground state all the time. This is because, even close to the centre of the ring, the intensity is beyond the threshold Is in absolute terms.
Now we succeed in separation: only at the position of the doughnut centre are the molecules allowed to emit, and we can therefore separate this signal from the signal of the neighbouring fibres. And now we can acquire images with subdiffraction resolution: we can move the beams across the specimen and separate each fibre from the other, because their molecules are forced to emit at different points in time. We play an “on/off game”. Within the much wider excitation region, only the subset of molecules at the centre of the doughnut ring is allowed to emit at any given point in time. All the others around them are effectively kept in the dark ground state. Whenever one checks which state they are in, one will nearly always find those molecules in the ground state.
Of course, a strength of light microscopy is that we can image living cells by video-rate recording with STED microscopy. An example is the imaging of synaptic vesicles in the axon of a living neuron. One can directly see how they move about, and we can study their dynamics and their fate over time. It is clearly important to be able to image living cells.
In the situation depicted in Fig. 1.7b, we cannot separate two of the close-by molecules because both are allowed to emit at the same time. But let us make the beam a bit stronger, so that only one molecule “fits in” the region in which the molecules are allowed to be “on”. Now the resolution limit is apparent: it is the size of a molecule, because a molecule is the smallest entity one can separate. After all, we separate features by preparing their molecules in two different states, and so it must be the molecule which is the limit of spatial resolution. When two molecules come very close together, we can separate them because at the time one of them is emitting, the other one is “off” and vice versa [28, 30, 31, 32].
Does one typically obtain molecular spatial resolution, and what about in a cell? For STED microscopy right now, the standard of resolution is between 20 and 40 nm, depending on the fluorophore and on its chemical environment. But this is something which is progressing; it is under continuous development. With fluorophores which have close-to-ideal properties and can be turned “on” and “off” as many times as desired, we can do much better, of course.
This may look like a proof-of-principle experiment, and to some extent it is. But it is not just that; there is another reason to perform these experiments [34, 36, 37]. The so-called charged nitrogen-vacancy (NV) centres are currently regarded as attractive candidates for quantum computation: as qubits operating at room temperature [38, 39]. They possess a spin state with a very long coherence time, which can be prepared and read out optically. Being less than a nanometre in size, they can sense magnetic fields at the nanoscale [40, 41]. They are inherently nanosensors, and STED is perhaps the best way of reading out their state and the magnetic fields at the nanoscale. In the end, this could make STED an interesting candidate for reading out qubits in a quantum computer, or who knows … Development goes on!
Indeed, it turns out that there is a strong reason for looking into other types of states and state transitions. Consider the state lifetimes (Fig. 1.10). For the basic STED transition, the lifetime of the state, the excited state, is nanoseconds (Fig. 1.10a). For metastable dark states used in methods termed ground state depletion (GSD) microscopy [42, 43, 44] (Fig. 1.10b) the lifetime of the state is microseconds, and for isomerization it is on the order of milliseconds (Fig. 1.10c). Why are these major increases in the utilized state lifetime relevant?
Well, just remember that we separate adjacent features by transferring their fluorescent molecules into two different states. But if one of these states disappears after a nanosecond, then the difference in states created disappears after a nanosecond. Consequently, one has to hurry putting in the photons, creating this difference in states, as well as reading it out, before it disappears. But if one has more time—microseconds, milliseconds—one can turn molecules off, read the remaining ones out, turn them on, turn them off, and so on; they stay there, because their states are long-lived. One does not have to hurry putting in the light, and this makes this “separation by states” operational at much lower light levels [28, 42].
To be more formal, the aforementioned intensity threshold Is scales inversely with the lifetime of the states involved (Fig. 1.10e): the longer the lifetime, the smaller Is is, and the diffraction barrier can be broken using this type of transition at much lower light levels. Is goes down from megawatts per square centimetre (STED) through kilowatts per square centimetre (GSD) down to watts per square centimetre for millisecond switching times—a range of six orders of magnitude. This makes transitions between long-lived states very interesting, of course. In the equation (Fig. 1.10d), Is goes down and with it, of course, also the applied intensity I, because one does not need as many photons per second in order to achieve the same resolution d.
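Combining the threshold Is with Abbe’s formula yields the well-known square-root law for STED/RESOLFT resolution, d ≈ λ/(2 NA √(1 + I/Is)), the equation referred to in Fig. 1.10d. A short numerical sketch (example values are our own choices):

```python
import math

# Square-root law of STED/RESOLFT resolution:
# d = wavelength / (2 * NA * sqrt(1 + I/Is)).
def sted_resolution(wavelength_nm, na, i_over_is):
    return wavelength_nm / (2.0 * na * math.sqrt(1.0 + i_over_is))

# Example: 600-nm STED light, NA = 1.4 objective.
print(f"I/Is =   0: d = {sted_resolution(600.0, 1.4, 0):.0f} nm")    # Abbe limit
print(f"I/Is = 100: d = {sted_resolution(600.0, 1.4, 100):.0f} nm")
```

Raising I/Is a hundredfold shrinks d roughly tenfold (here from about 214 nm to about 21 nm), consistent with the 20–40 nm routinely reported for STED.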
Notwithstanding the somewhat different optical arrangement, the key is the molecular transition. The imaging performance—including the resolution and the contrast level, as well as other factors—is determined by the molecular transition chosen.
Do not separate just by focusing. Separate by molecular states, in the easiest case by “on/off” states [28, 29, 30, 31]. If separating by molecular states, one can indeed distinguish the features; one can tell the molecules apart even though they reside within the region dictated by diffraction. We can tell, for instance, one molecule apart from its neighbours and discern it (Fig. 1.13, bottom). For this purpose, we have our choice of the states introduced already (Fig. 1.10), which we can use to distinguish features within the diffraction region.
In the methods described, STED, RESOLFT and so on, the position of the state—where the molecule is “on”, where the molecule is “off”—is determined by a pattern of light featuring one or more intensity zeros, for example a doughnut. This light pattern clearly determines where the molecule has to be “on” and where it has to be “off”. The coordinates X, Y, Z are tightly controlled by the incident pattern of light and the position(s) of its zero(s). Moving the pattern to the next position X, Y, Z—one knows the position of the occurrence of the “on” and “off” states already. One does not necessarily require many detected photons from the “on” state molecules, because the detected photons are merely indicators of the presence of a feature. The occurrence of the state and its location is fully determined by the incident light pattern.
An interesting insight here is that one needs a bright pattern of emitted light to find out the position just as one needs a bright pattern of incident light in STED/RESOLFT to determine the position of emission. Not surprisingly, one always needs bright patterns of light when it comes to positions, because if one has just a single photon, this alone tells nothing. The photon can go anywhere within the realm of diffraction, there is no way to control where it goes within the diffraction zone. In other words, when dealing with positions, one needs many photons by definition, because this is inherent to diffraction. Many photons are required for defining positions of “on”- and “off”-state-molecules in STED/RESOLFT microscopy, just as many photons are required to find out the position of “on”-state molecules in the stochastic method PALM.
Although the PALM principle can also be implemented on a single diffraction zone only (i.e. using a single focused beam of light), it is usually implemented in a “parallelized” way, i.e. on a larger field of view containing many diffraction zones. PALM parallelization requires that there may be only a single “on”-state molecule within a diffraction zone, i.e. within the distance dictated by the diffraction barrier. However, the position of this molecule is completely random. Therefore, we have to make sure that the “on”-state molecules are far enough apart from each other, so that they are still identifiable as separate molecules. While in (STED/RESOLFT) the position of a certain state is given by the pattern of light falling on the sample, position in PALM is established from the pattern of (fluorescence) light coming out of the sample.
What does I/Is in STED/RESOLFT stand for? Is can be seen as the number of photons that one needs to ensure that there is at least one photon interacting with the molecule, pushing it from one state to the other in order to create the required difference in molecular states. I/Is is, so to speak, the number of photons which really “can do something” at the molecule, while most of the others just “pass by”. Similarly, in the PALM concept, the number of photons n in 1/√(n) is the number of those photons that are detected, i.e. that really contribute to revealing the position of the emitting molecule. In other words, in both concepts, to attain a high coordinate precision, one needs many photons that really do something. This analogy very clearly shows the importance of the number of photons for achieving coordinate precision in both concepts.
However, in both cases the separation of features is, of course, accomplished by an “on/off”-transition [28, 29, 30, 31]. This is how we make features distinct, how we tell them apart. As a matter of fact, all the super-resolution methods which are in place right now and really useful, achieve molecular distinguishability by transiently placing the molecules that are closer together than the diffraction barrier in two different states for the time period in which they are jointly scrutinized by the detector. “Fluorescent” and “non-fluorescent” is the easiest pair of states to play with, and so this is what has worked out so far.
1.2 Recent Developments: Nanoscopy at the MINimum
DyMIN is a related recording strategy which minimizes exposure to unduly high intensities except at scanning steps where these intensities are strictly required for resolving features (Fig. 1.20b–d). Like MINFIELD, the DyMIN approach achieves dose reductions by up to orders of magnitude, particularly for relatively sparse fluorophore distributions. Initially demonstrated for STED immunofluorescence imaging, both MINFIELD and DyMIN will be explored for other classes of fluorophores, including the inherently lower-light-level RESOLFT nanoscopy variants with genetically encoded fluorescent proteins. The recently described organic switchable photochromic compounds will also be further developed as attractive alternatives in this regard. The synergistic combination of two separate fluorophore state transitions in a recent concept termed multiple off-state transitions (MOST) for nanoscopy has also enabled many more image frames to be captured, at much improved contrasts and with lower STED light dose at a given resolution than for standard STED. Approaches to directly count molecules with STED have also been developed, and can be used to quantify the composition of suitably labeled molecular clusters.
While the experimental developments of the MINFLUX concept are still at an early stage, it is worth commenting on its fundamental advantage over localization based on the emitted fluorescence alone. As discussed in [72, 73], in PALM/STORM, as in camera-based tracking applications, a molecule’s position is inferred from the maximum of its fluorescence diffraction pattern (back-projected into sample space). The precision of such camera-based localization ideally reaches σcam ≥ σPSF/√N, with σPSF being the standard deviation of the pattern and N the number of fluorescence photons making up the pattern. Note that σcam is thus clearly bounded by the finite fluorescence emission rate, which for currently used fluorophores rarely allows more than a few hundred photon detections per millisecond (<1 MHz). Moreover, emission is frequently interrupted and eventually ceases due to blinking and bleaching. This keeps the photon emission rate the limiting factor for the obtainable spatio-temporal resolution. As a result, state-of-the-art single-molecule tracking performance long remained at precisions of tens of nanometres on time scales of several tens of milliseconds. Drawing on the basic ideas of the coordinate determination employed in STED/RESOLFT microscopy, the MINFLUX concept addresses these fundamental limitations. By localizing individual emitters with an excitation beam featuring an intensity minimum that is spatially precisely controlled, MINFLUX takes advantage of coordinate targeting for single-molecule localization. The basic steps are illustrated for one spatial dimension in Fig. 1.21. In a typical two-dimensional MINFLUX implementation, the position of a molecule is obtained by placing the minimum of a doughnut-shaped excitation beam at a known set of spatial coordinates in the molecule’s proximity. These coordinates are within a range L in which the molecule is anticipated (Fig. 1.22a).
Probing the number of detected photons for each doughnut minimum coordinate yields the molecular position. It is the position at which the doughnut would produce minimal emission if the excitation intensity minimum were targeted to it directly. As the intensity minimum is ideally a zero, it is the point at which emission is ideally absent. The precision of the position estimate improves with the square root of the total number of detected photons and, more importantly, with decreasing range L, the spatial scale inserted from the outside into the experiment. For small ranges L, for which the intensity minimum is approximated by a quadratic function, the localization precision does not depend on any wavelength and, in the case of no background and perfect doughnut control, the precision σMINFLUX simply scales with L/√N at the centre of the investigated range. In other words, the better the coordinates of the excitation minimum match the position of the molecule, the fewer fluorescence detections are needed to reach a given precision. In the conceptual limit where the excitation minimum coincides with the position of the emitter, i.e. L = 0, the emitter position is rendered by vanishing fluorescence detection. This is contrary to conventional centroid-based localization, where precision improvements are tightly bound to ever larger numbers of detected photons.
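The scaling argument can be made concrete with a back-of-the-envelope comparison (idealized: no background, quadratic minimum; the numbers are our own illustrative choices). Both σcam ≈ σPSF/√N and σMINFLUX ≈ L/√N invert to N ≈ (scale/σ)²; the difference is that σPSF is fixed by diffraction, while L can be shrunk from the outside:

```python
import math

# Photons needed for a target precision, from sigma = scale / sqrt(N):
# N = (scale / sigma_target)**2. Applies to both camera localization
# (scale = sigma_PSF, fixed by diffraction) and MINFLUX (scale = L, shrinkable).
def photons_for_precision(scale_nm, target_nm):
    return math.ceil((scale_nm / target_nm) ** 2)

target = 1.0  # nm precision
print("camera,  sigma_PSF = 100 nm:", photons_for_precision(100.0, target))
print("MINFLUX, L = 50 nm:         ", photons_for_precision(50.0, target))
print("MINFLUX, L = 10 nm:         ", photons_for_precision(10.0, target))
```

Shrinking L by an order of magnitude reduces the required photon budget a hundredfold, which is why matching the excitation minimum to the molecule’s position pays off so dramatically.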
The already demonstrated tracking of fluorophores with substantially sub-millisecond position sampling (Fig. 1.22c) is only the beginning in a quest for highest spatiotemporal capabilities (compare data in Fig. 1.22d). The inherent confocality should also provide a critical advantage when considering imaging in more dense and three-dimensional specimens, such as brain slices and in-vivo imaging scenarios. With further development of other aspects, including field-of-view enlargement, etc., MINFLUX is bound to transform the limits of what can be observed in cells and molecular assemblies with light. This should most probably impact cell and neurobiology and possibly also structural biology. Moreover, it should be a great tool for studying molecular interactions and intra-macromolecular dynamics in a range that has not been accessible so far.
Substantial portions of the discussion in this chapter have been only slightly modified from the published text of the Nobel Lecture, as delivered by Stefan W. Hell in Stockholm on December 8, 2014 (Copyright © The Nobel Foundation, which has granted permission for reuse of the materials).
- 2. Verdet E. Leçons d’optique physique, vol. 1. Paris: Victor Masson et fils; 1869.
- 3. Lord Rayleigh. On the theory of optical images, with special reference to the microscope. Philos Mag. 1896;5(42):167–95.
- 4. von Helmholtz H. Die theoretische Grenze für die Leistungsfähigkeit der Mikroskope. Ann Phys Chem. 1874;6:557–84.
- 5. Alberts B, et al. Molecular biology of the cell. 4th ed. New York, NY: Garland Science; 2002.
- 10. Born M, Wolf E. Principles of optics. 7th ed. Cambridge: Cambridge University Press; 2002.
- 14. Loudon R. The quantum theory of light. Oxford: Oxford University Press; 1983.
- 32. Hell SW. Far-field optical nanoscopy. In: Gräslund A, Rigler R, Widengren J, editors. Single molecule spectroscopy in chemistry, physics and biology. Berlin: Springer; 2009. p. 365–98.
- 69. Oracz J, Westphal V, Radzewicz C, Sahl SJ, Hell SW. Photobleaching in STED nanoscopy and its dependence on the photon flux applied for reversible silencing of the fluorophore. Sci Rep. 2017;7:11354.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.